Abstract: These lecture notes present some new concentration inequalities
for Feynman-Kac particle processes. We analyze different types of stochastic
particle models, including particle profile occupation measures, genealogical tree
based evolution models, particle free energies, as well as backward Markov chain
particle models. We illustrate these results with a series of topics related to
computational physics and biology, stochastic optimization, signal processing and
Bayesian statistics, and many other probabilistic machine learning algorithms.
Special emphasis is given to the stochastic modeling and the quantitative
performance analysis of a series of advanced Monte Carlo methods, including particle
filters, genetic type island models, Markov bridge models, and interacting particle
Markov chain Monte Carlo methodologies.

On the concentration properties of interacting particle processes

Summary: These lecture notes present new exponential concentration
inequalities for the interacting empirical processes associated with
Feynman-Kac type particle models. We analyze different
stochastic models, including current occupation measures of genetic
populations, historical models based on genealogical tree evolutions,
free energy estimates, as well as backward Markov chain
particle models. We illustrate these results with a series
of applications related to computational physics and biology, stochastic
optimization, signal processing and Bayesian statistics, together with many
probabilistic machine learning algorithms. Particular emphasis
is given to the stochastic modeling of these Monte Carlo algorithms, and
to the quantitative analysis of their performance. In particular, we examine the
convergence of particle filters, genetic type "island models",
Markov bridge processes, as well as various interacting MCMC type
methods.

Keywords: Exponential concentration properties, interacting empirical
processes, particle interpretations of Feynman-Kac formulae.
1 Stochastic particle methods

1.1 Introduction 

Stochastic particle methods have come to play a significant role in applied proba- 
bility, numerical physics, Bayesian statistics, probabilistic machine learning, and 
engineering sciences. 

They are increasingly used to solve a variety of problems, including
nonlinear filtering equations, data assimilation problems, rare event sampling,
hidden Markov chain parameter estimation, stochastic control problems, and
financial mathematics. They are also used in computational physics for free energy
computations and for the estimation of ground states of Schrödinger operators,
as well as in computational chemistry for sampling the conformation of polymers
in a given solvent.

A rigorous understanding of these new particle Monte Carlo methodologies
leads to fascinating mathematics related to Feynman-Kac path integral
theory and its interacting particle interpretations [21, ?, ?]. In the last two
decades, this line of research has been developed by using methods from the
stochastic analysis of interacting particle systems and nonlinear semigroup models in
distribution spaces, but it has also generated difficult questions that cannot be
addressed without developing new mathematical tools.

Let us survey some of the important challenges that arise. 

For numerical applications, it is essential to obtain non-asymptotic
quantitative information on the convergence of the algorithms. Asymptotic theory,
including central limit theorems, moderate deviations, and large deviation
principles, is clearly of limited practical value. An overview of these asymptotic
results in the context of mean field and Feynman-Kac particle models can be
found in the series of articles [?, ?, ?, ?, ?, ?].

Furthermore, when solving a given concrete problem, it is important to
obtain explicit non-asymptotic error bounds to ensure that the stochastic
algorithm is provably correct. While non-asymptotic propagation of chaos
results provide some insight into the bias properties of these models, they rarely
provide useful effective convergence rates.

Last but not least, it is essential to analyze the robustness properties, and
more particularly the uniform performance of particle algorithms w.r.t. the
time horizon. By construction, these important questions are intimately related
to the stability properties of complex nonlinear Markov chain semigroups
associated with the limiting measure valued process. This line of ideas has been
further developed in the articles [?, 39, 23, 44], and in the books [25, 26].

Without any doubt, one of the most powerful mathematical tools to analyze
the deviations of Monte Carlo based approximations is the theory of empirical
processes and measure concentration theory. In the last two decades, these
tools have become one of the most important steps forward in infinite dimensional
stochastic analysis, advanced machine learning techniques, as well as in the
development of a statistical non-asymptotic theory.

In recent years, a lot of effort has been devoted to describing the behavior of
the supremum norm of empirical functionals around the mean value of the norm.
For an overview of these subjects, we refer the reader to the seminal books of
D. Pollard [75], and of A.N. Van der Vaart and J.A. Wellner [?], and to
the remarkable articles by E. Giné [53], M. Ledoux [57], and M. Talagrand
[53, 55, 55], and the more recent article by R. Adamczak [1]. The best constants
in Talagrand's concentration inequalities were obtained by Th. Klein and E.
Rio [53]. In this article, the authors proved the functional version of Bennett's
and Bernstein's inequalities for sums of independent random variables.

The main difficulties we encountered in applying these concentration
inequalities to interacting particle models are of different kinds:

Firstly, all of the concentration inequalities developed in the literature on
empirical processes still involve the mean value of the supremum norm of the empirical
functionals. In practical situations, these tail style inequalities can only be used
if we have some precise information on the magnitude of the mean value of the
supremum norm of the functionals.

On the other hand, the range of application of the theory of empirical
processes and measure concentration theory is restricted to independent random
samples, or equivalently product measures, and more recently to mixing Markov
chain models. In contrast, stochastic particle techniques are not based
on fully independent sequences, nor on Markov chain Monte Carlo principles,
but on interacting particle samples combined with complex nonlinear Markov
chain semigroups. More precisely, although particle models are built
sequentially using conditionally independent random samples, their respective
conditional distributions are still random. In addition, they strongly depend in
a nonlinear way on the occupation measure of the current population.

In summary, the concentration analysis of interacting particle processes
requires the development of new stochastic perturbation style techniques to control
the interaction propagation and the degree of independence between the samples.

The first article extending empirical processes theory to particle models is
a joint work of the first author with M. Ledoux [43]. In this work, we proved
Glivenko-Cantelli and Donsker theorems under entropy conditions, as well as
non-asymptotic exponential bounds for Vapnik-Chervonenkis classes of sets or
functions. Nevertheless, in practical situations these non-asymptotic results
tend to be a little disappointing, with very poor constants that degenerate
w.r.t. the time horizon.

The second most important result on the concentration properties of mean
field particle models is the recent article of the first author with E. Rio [33]. This
article is only concerned with finite marginal models. The authors generalize the
classical Hoeffding, Bernstein and Bennett inequalities for independent random
sequences to interacting particle systems.

In these notes, we survey some of these results, and we provide new
concentration inequalities for interacting empirical processes. We emphasize that
these lectures don't give a comprehensive treatment of the theory of interacting
empirical processes. To name a few missing topics, we do not discuss large
deviation principles w.r.t. the strong $\tau$-topology, Donsker type fluctuation
theorems, moderate deviation principles, and continuous time models. The first two
topics are developed in the monograph [55], the third one is developed in [40], and
the last one is still an open research subject.

These notes emphasize a single stochastic perturbation method, with second 
order expansion entering the stability properties of the limiting Feynman-Kac 
semigroups. The concentration results attained are probably not the best pos- 
sible of their kind. We have chosen to strive for just enough generality to derive 
useful and uniform concentration inequalities w.r.t. the time horizon, without 
having to impose complex and often unnatural regularity conditions to squeeze
them into the general theory of empirical processes. 

Some of the results are borrowed from the recent article [44], and many
others are new. These notes should be complemented with the book [?], the
books [25, ?], and the article [35]. A very basic knowledge of statistics and
machine learning theory will be useful, but not necessary. A good background
in Markov chain theory and in stochastic semigroup analysis is necessary.

We have done our best to give a self-contained presentation, with detailed
proofs assuming however some familiarity with Feynman-Kac models, and basic 
facts on the theory of Markov chains on abstract state spaces. Only in sec- 
tion [53111 we have skipped the proof of some tools from convex analysis. We 
hope that the essential ideas are still accessible to the readers. 

It is clearly beyond the scope of these lecture notes to give an exhaustive list
of references to articles in computational physics, engineering sciences, and
machine learning presenting heuristic-like particle algorithms to solve a specific
estimation problem. With a few exceptions, we have only provided references
to articles with rigorous and well founded mathematical treatments of particle
models. We apologize in advance for possible errors, or for references that have
been omitted due to the lack of accurate information.

These notes grew from a series of lectures the first author gave in the
Computer Science and Communications Research Unit of the University of
Luxembourg in February and March 2011. They were reworked, with the addition
of new material on the concentration of empirical processes, for a course given
at the Sino-French Summer Institute in Stochastic Modeling and Applications
(CNRS-NSFC Joint Institute of Mathematics), held at the Academy of
Mathematics and System Science, Beijing, in June 2011. The Summer Institute was
ably organized by Fuzhou Gong, Ying Jiao, Gilles Pagès, and Mingyu Xu, and
the members of the scientific committee, including Nicole El Karoui, Zhiming
Ma, Shige Peng, Liming Wu, Jia-An Yan, and Nizar Touzi. The first author is
grateful to them for giving him the opportunity to experiment on a receptive
audience with material not entirely polished.

In reworking the lectures, we have tried to resist the urge to push the analysis 
to general classes of mean field particle models, in the spirit of the recent joint 
article with E. Rio [44]. Our principal objective has been to develop just enough 
analysis to handle four types of Feynman-Kac interacting particle processes; 
namely, genetic dynamic population models, genealogical tree based algorithms, 
particle free energies, as well as backward Markov chain particle models. These 
application models do not exhaust the possible uses of the theory developed in 
these lectures. 

1.2 A brief review on particle algorithms 

Stochastic particle methods belong to the class of Monte Carlo methods. They
can be thought of as a universal particle methodology for sampling complex
distributions in high-dimensional state spaces.

We can distinguish two different classes of models; namely, diffusion type
interacting processes, and interacting jump particle models. Feynman-Kac
particle methods belong to the second class of models, with rejection-recycling
jump type interaction mechanisms. In contrast to conventional
acceptance-rejection type techniques, Feynman-Kac particle methods are equipped with an
adaptive and interacting recycling strategy.

The common central feature of all the Monte Carlo particle methodologies
developed so far is to solve discrete generation, or continuous time
integro-differential equations in distribution spaces. The first heuristic-like description
of these probabilistic techniques in mathematical physics goes back to the Los
Alamos report [52] and the article [?] by C.J. Everett and S. Ulam in 1948,
and the short article by N. Metropolis and S. Ulam [73], published in 1949.

In some instances, the flow of measures is dictated by the problem at hand.
In advanced signal processing, the conditional distributions of the signal given
partial and noisy observations are given by the so-called nonlinear filtering
equation in distribution space (see for instance [19, ?, 53, 23, ?], and references
therein).

Free energies and ground states of Schrödinger operators are given by the
quasi-invariant distribution of a Feynman-Kac conditional distribution flow of
non-absorbed particles in absorbing media. We refer the reader to the articles
by E. Cancès, B. Jourdain and T. Lelièvre [?], M. El Makrini, B. Jourdain and
T. Lelièvre [?], M. Rousset [53], the pair of articles of the first author with L.
Miclo [24, 23], the one with A. Doucet [29], the monograph [25], and the
references therein.

In mathematical biology, branching processes and infinite population models
are also expressed by nonlinear parabolic type integro-differential equations.
Further details on this subject can be found in the articles by D.A. Dawson and
his co-authors [15, ?, ?], the works of E.B. Dynkin [49] and J.F. Le Gall [68],
and more particularly the seminal book of S.N. Ethier and T.G. Kurtz [?], and
the pioneering article by W. Feller [54].

In other instances, we formulate a given estimation problem in terms of a
sequence of distributions with increasing complexity on state space models with
increasing dimension. These stochastic evolutions can be related to decreasing
temperature schedules in Boltzmann-Gibbs measures, multilevel decompositions
for rare event excursion models on critical level sets, decreasing subsets
strategies for sampling tail style distributions, and many other sequential importance
sampling plans. For a more thorough discussion on these models we refer the
reader to [27].

From the pure probabilistic point of view, any flow of probability measures
can be interpreted as the evolution of the laws of the random states of a Markov
process. In contrast to conventional Markov chain models, the Markov
transitions of these chains may depend on the distribution of the current random
state. The mathematical foundations of these discrete generation models were
laid in 1996 in [19] in the context of nonlinear filtering problems.
Further analysis was developed in a joint work [23] of the first author with L.
Miclo published in 2000. For a more thorough discussion on the origin and
the performance analysis of these discrete generation models, we also refer to
the monograph [22], and the joint articles of the first author with A.
Guionnet [?, 37, ?, ?], and M. Kouritzin [12].

The continuous time version of these nonlinear type Markov chain
models takes its origin from the 1960s, with the development of fluid
mechanics and statistical physics. We refer the reader to the pioneering works of
H.P. McKean [75, 76]; see also the more recent treatments by N. Bellomo and
M. Pulvirenti [3, ?], the series of articles by C. Graham and S. Méléard on
interacting jump models [58, ?, ?], the articles by S. Méléard on Boltzmann
equations [71, ?, 75, ?], and the lecture notes of A.S. Sznitman [?], and
references therein.

In contrast to conventional Markov chain Monte Carlo techniques, these
McKean type nonlinear Markov chain models can be thought of as perfect
importance sampling strategies, in the sense that the desired target measures coincide
at any time step with the law of the random states of a Markov chain.
Unfortunately, as we mentioned above, the transitions of these chains depend on the
distributions of their random states. Thus, they cannot be sampled without an
additional level of approximation. One natural solution is to use a mean field
particle interpretation model. These stochastic techniques belong to the class
of stochastic population models, with free evolution mechanisms, coupled with
branching and/or adaptive interacting jumps. At any time step, the
occupation measure of the population of individuals approximates the solution of the
nonlinear equation, when the size of the system tends to $\infty$.

In genetic algorithms and sequential Monte Carlo literature, the reference
free evolution model is interpreted as a reference sequence of twisted Markov
chain samplers. These chains are used to perform the mutation/proposal
transitions. As in conventional Markov chain Monte Carlo methods, the interacting
jumps are interpreted as an acceptance-rejection transition, equipped with
sophisticated interacting and adaptive recycling mechanisms. In Bayesian
statistics and engineering sciences, the resulting adaptive particle sampling model is
often referred to as a sequential Monte Carlo algorithm, a genetic procedure, or
simply a Sampling Importance Resampling method, mainly because it is based on
importance sampling plans and online approximations of a flow of probability
measures.

Since the 1960s, the adaptive particle recycling strategy has also been
associated in biology and engineering science with several heuristic-like paradigms,
with a proliferation of botanical names, depending on the application area they are
designed for: bootstrapping, switching, replenishing, pruning, enrichment, cloning,
reconfigurations, resampling, rejuvenation, acceptance/rejection, spawning.

Of course, the idea of duplicating online better-fitted individuals and moving 
them one step forward to explore state-space regions is the basis of various 
stochastic search algorithms. To name a few: 

Particle and bootstrap filters, Rao-Blackwell particle filters, sequential Monte
Carlo methods, sequentially interacting Markov chain Monte Carlo, genetic type
search algorithms, Gibbs cloning search techniques, interacting simulated
annealing algorithms, sampling-importance resampling methods, quantum Monte
Carlo walkers, adaptive population Monte Carlo sampling models, and many
other evolutionary type Monte Carlo methods.

For a more detailed discussion on these models, with precise references, we
refer the reader to the three books [53, ?, ?].

1.3 Feynman-Kac path integrals 

Feynman-Kac measures represent the distribution of the paths of a Markov
process, weighted by a collection of potential functions. These functional models are
natural mathematical extensions of the traditional changes of probability
measures, commonly used in importance sampling technologies, Bayesian inference,
and in nonlinear filtering modeling.

RR n° 7677

On the concentration properties of Interacting particle processes

These stochastic models are defined in terms of only two ingredients:
a Markov chain $X_n$, with Markov transitions $M_n$ on some measurable state
spaces $(E_n, \mathcal{E}_n)$ and with initial distribution $\eta_0$, and a sequence of
$(0,1]$-valued potential functions $G_n$ on the sets $E_n$.

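The Feynman-Kac path measure associated with the pairs $(M_n, G_n)$ is the
probability measure $\mathbb{Q}_n$ on the product state space

$$\mathbf{E}_n := (E_0 \times \cdots \times E_n)$$

defined by the following formula

$$d\mathbb{Q}_n := \frac{1}{\mathcal{Z}_n} \left\{ \prod_{0 \le p < n} G_p(X_p) \right\} d\mathbb{P}_n \tag{1}$$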



where $\mathcal{Z}_n$ is a normalizing constant and $\mathbb{P}_n$ is the distribution of the random
paths

$$\mathbf{X}_n = (X_0, \ldots, X_n) \in \mathbf{E}_n$$

of the Markov process $X_p$ from the origin $p = 0$ with initial distribution $\eta_0$, up
to the current time $p = n$. We also denote by

$$\Gamma_n = \mathcal{Z}_n \, \mathbb{Q}_n \tag{2}$$

its unnormalized version.

The prototype model we have in mind is the traditional particle absorbed
Markov chain model

$$X_n^c \in E_n^c := E_n \cup \{c\} \xrightarrow{\ \text{absorption} \,\sim\, (1 - G_n)\ } \widehat{X}_n^c \xrightarrow{\ \text{exploration} \,\sim\, M_{n+1}\ } X_{n+1}^c \tag{3}$$

The chain $X_n^c$ starts at some initial state $X_0^c$ randomly chosen with
distribution $\eta_0$. During the absorption stage, we set $\widehat{X}_n^c = X_n^c$ with probability $G_n(X_n^c)$,
otherwise we put the particle in an auxiliary cemetery state $\widehat{X}_n^c = c$. When the
particle $\widehat{X}_n^c$ is still alive (that is, if we have $\widehat{X}_n^c \in E_n$), it performs an elementary
move $\widehat{X}_n^c \rightsquigarrow X_{n+1}^c$ according to the Markov transition $M_{n+1}$. Otherwise, the
particle is absorbed and we set $X_p^c = \widehat{X}_p^c = c$, for any time $p > n$.

If we let $T$ be the first time $\widehat{X}_n^c = c$, then we have the Feynman-Kac
representation formulae

$$\mathbb{Q}_n = \mathrm{Law}\big((X_0^c, \ldots, X_n^c) \mid T \ge n\big) \quad \text{and} \quad \mathcal{Z}_n = \mathrm{Proba}(T \ge n)$$

For a more thorough discussion on the variety of application domains of
Feynman-Kac models, we refer the reader to chapter 2.
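
The absorption interpretation of the normalizing constant can be checked numerically. The following minimal sketch assumes a hypothetical two-state, time-homogeneous model (the values of `ETA0`, `M` and `G` are illustrative and do not come from these notes). It compares the exact value of $E\big(\prod_{0 \le p < n} G_p(X_p)\big)$ with the fraction of simulated absorbed chains that survive the absorption stages $0, \ldots, n-1$.

```python
import random

# Illustrative two-state model; every number below is an assumption.
ETA0 = [0.5, 0.5]               # initial distribution eta_0
M = [[0.9, 0.1], [0.2, 0.8]]    # transition matrix, M_n = M for all n
G = [0.9, 0.5]                  # potential, G_n = G with values in (0, 1]


def exact_survival(n):
    """Z_n = E[prod_{0 <= p < n} G(X_p)], computed by a forward recursion."""
    gamma = ETA0[:]
    for _ in range(n):
        alive = [gamma[x] * G[x] for x in range(2)]            # absorption
        gamma = [sum(alive[x] * M[x][y] for x in range(2))     # exploration
                 for y in range(2)]
    return sum(gamma)


def absorption_time(n, rng):
    """Return the first stage p < n at which the particle is absorbed,
    or n if it survives the absorption stages 0, ..., n - 1."""
    x = rng.choices([0, 1], weights=ETA0)[0]
    for p in range(n):
        if rng.random() >= G[x]:     # absorbed with probability 1 - G(x)
            return p
        x = rng.choices([0, 1], weights=M[x])[0]
    return n


def monte_carlo_survival(n, runs, seed=0):
    """Fraction of runs with T >= n, an estimate of Proba(T >= n)."""
    rng = random.Random(seed)
    return sum(absorption_time(n, rng) >= n for _ in range(runs)) / runs


if __name__ == "__main__":
    print(exact_survival(5), monte_carlo_survival(5, 100_000))
```

The crude Monte Carlo estimate degrades quickly when the survival probability is small, which is one motivation for the interacting particle models discussed later in this chapter.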

We also denote by $\eta_n$ and $\gamma_n$ the $n$-th time marginals of $\mathbb{Q}_n$ and $\Gamma_n$. It is a
simple exercise to check that

$$\gamma_n = \gamma_{n-1} Q_n \quad \text{and} \quad \eta_{n+1} = \Phi_{n+1}(\eta_n) := \Psi_{G_n}(\eta_n) M_{n+1} \tag{4}$$

with the positive integral operator

$$Q_n(x, dy) = G_{n-1}(x) \, M_n(x, dy)$$

and the Boltzmann-Gibbs transformation

$$\Psi_{G_n}(\eta_n)(dx) = \frac{1}{\eta_n(G_n)} \, G_n(x) \, \eta_n(dx) \tag{5}$$

In addition, the normalizing constants $\mathcal{Z}_n$ can be expressed in terms of the flow
of marginal measures $\eta_p$, from the origin $p = 0$ up to the current time $n$, with
the following multiplicative formulae:

$$\mathcal{Z}_n := \gamma_n(1) = \mathbb{E}\left( \prod_{0 \le p < n} G_p(X_p) \right) = \prod_{0 \le p < n} \eta_p(G_p) \tag{6}$$

This multiplicative formula is easily checked using the induction

$$\gamma_{n+1}(1) = \gamma_n(G_n) = \eta_n(G_n) \, \gamma_n(1)$$
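
As a sanity check of the multiplicative formula, the sketch below evaluates both sides on a hypothetical two-state, time-homogeneous model (the values of `ETA0`, `M` and `G` are illustrative assumptions). The function `z_product` follows the normalized flow through the Boltzmann-Gibbs transformation, while `z_unnormalized` propagates the unnormalized measures with the operator $Q(x, dy) = G(x) M(x, dy)$.

```python
# Illustrative two-state model; every number below is an assumption.
ETA0 = [0.5, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]
G = [0.9, 0.5]
STATES = range(len(ETA0))


def boltzmann_gibbs(eta):
    """Psi_G(eta)(x) = G(x) eta(x) / eta(G)."""
    norm = sum(G[x] * eta[x] for x in STATES)
    return [G[x] * eta[x] / norm for x in STATES]


def phi(eta):
    """One step eta -> Psi_G(eta) M of the nonlinear recursion."""
    psi = boltzmann_gibbs(eta)
    return [sum(psi[x] * M[x][y] for x in STATES) for y in STATES]


def z_product(n):
    """Z_n as the product of eta_p(G) over 0 <= p < n."""
    eta, z = ETA0[:], 1.0
    for _ in range(n):
        z *= sum(G[x] * eta[x] for x in STATES)
        eta = phi(eta)
    return z


def z_unnormalized(n):
    """Z_n = gamma_n(1), with gamma_n = gamma_{n-1} Q and Q(x, y) = G(x) M(x, y)."""
    gamma = ETA0[:]
    for _ in range(n):
        gamma = [sum(gamma[x] * G[x] * M[x][y] for x in STATES) for y in STATES]
    return sum(gamma)


if __name__ == "__main__":
    for n in range(6):
        print(n, z_product(n), z_unnormalized(n))
```

Both functions should agree up to floating point rounding for every time horizon.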

The abstract formulae discussed above are more general than they may appear.
For instance, they can be used to analyze without further work path space
models, including historical processes or transition space models, as well as finite
excursion models. These functional models also encapsulate quenched
Feynman-Kac models, Brownian type bridges and linear Gaussian Markov chains
conditioned on starting and end points.

For a more thorough discussion on these path space models, we refer the 
reader to section 2.4, section 2.6, chapters 11-12 in the monograph [5S], as well 
as to the section [51 in the former lecture notes. 

When the Markov transitions $M_n$ are absolutely continuous with respect to
some measures $\lambda_n$ on $E_n$, and for any $(x, y) \in (E_{n-1} \times E_n)$ we have

$$H_n(x, y) := \frac{dM_n(x, \cdot)}{d\lambda_n}(y) > 0 \tag{7}$$

we also have the following backward formula

$$\mathbb{Q}_n(d(x_0, \ldots, x_n)) = \eta_n(dx_n) \prod_{q=1}^{n} M_{q, \eta_{q-1}}(x_q, dx_{q-1}) \tag{8}$$

with the collection of Markov transitions defined by

$$M_{n+1, \eta_n}(x, dy) \propto G_n(y) \, H_{n+1}(y, x) \, \eta_n(dy) \tag{9}$$

The proof of this formula is housed in section 3.2.
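
The backward formula can be verified by brute force on a small finite state space. The sketch below assumes a hypothetical two-state, time-homogeneous model with the counting measure as reference measure, so that the density $H(x, y)$ is simply the matrix entry `M[x][y]` (all numerical values are illustrative). It enumerates every path and compares the path measure obtained from the forward definition with the one obtained from the backward Markov transitions.

```python
from itertools import product

# Illustrative two-state model; every number below is an assumption.
ETA0 = [0.5, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]   # H(x, y) = M[x][y] > 0 w.r.t. counting measure
G = [0.9, 0.5]
S = range(2)


def flow(n):
    """Marginals eta_0, ..., eta_n from the recursion eta -> Psi_G(eta) M."""
    etas = [ETA0[:]]
    for _ in range(n):
        eta = etas[-1]
        norm = sum(G[x] * eta[x] for x in S)
        etas.append([sum(G[x] * eta[x] * M[x][y] for x in S) / norm for y in S])
    return etas


def backward_kernel(eta, x):
    """Row M_{q, eta}(x, .), with entries proportional to G(y) H(y, x) eta(y)."""
    w = [G[y] * M[y][x] * eta[y] for y in S]
    total = sum(w)
    return [v / total for v in w]


def path_measure_forward(n):
    """Q_n from its definition, by enumerating all paths (x_0, ..., x_n)."""
    w = {}
    for path in product(S, repeat=n + 1):
        v = ETA0[path[0]]
        for p in range(n):
            v *= G[path[p]] * M[path[p]][path[p + 1]]
        w[path] = v
    z = sum(w.values())
    return {path: v / z for path, v in w.items()}


def path_measure_backward(n):
    """Q_n from the backward formula: eta_n(x_n) times backward transitions."""
    etas = flow(n)
    q = {}
    for path in product(S, repeat=n + 1):
        v = etas[n][path[n]]
        for k in range(n, 0, -1):
            v *= backward_kernel(etas[k - 1], path[k])[path[k - 1]]
        q[path] = v
    return q


if __name__ == "__main__":
    fwd, bwd = path_measure_forward(3), path_measure_backward(3)
    print(max(abs(fwd[p] - bwd[p]) for p in fwd))
```

The agreement follows from a telescoping argument, since the normalizing constant of the backward kernel at time $q$ equals $\eta_{q-1}(G_{q-1}) \, \eta_q(x_q)$ in this discrete setting.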

Before launching into the description of the particle approximation of these 
models, we end this section with some connexions between discrete generation 
Feynman-Kac models and more conventional continuous time models arising in 
physics and scientific computing. 

The Feynman-Kac models presented above play a central role in the numer- 
ical analysis of certain partial differential equations, offering a natural way to 
solve these functional integral models by simulating random paths of stochastic 
processes. These Feynman-Kac models were originally presented by Mark Kac
in 1949 [?] for continuous time processes.

These continuous time models are used in molecular chemistry and
computational physics to calculate the ground state energy of some Hamiltonian
operators associated with some potential function $V$ describing the energy of
a molecular configuration (see for instance [?, 23, 13, ?, ?] and references
therein). To better connect these partial differential equation models with
(1), let us assume that $M_n(x_{n-1}, dx_n)$ is the Markov probability transition
$X_n = x_n \rightsquigarrow X_{n+1} = x_{n+1}$ coming from a discretization in time $X_n = X'_{t_n}$ of
a continuous time $E$-valued Markov process $X'_t$ on a given time mesh $(t_n)_{n \ge 0}$
with a given time step $(t_n - t_{n-1}) = \Delta t$. For potential functions of the form
$G_n = e^{-V \Delta t}$, the measures $\mathbb{Q}_n \simeq_{\Delta t \to 0} \mathbb{Q}_{t_n}$ represent the time discretization of
the following distribution:

$$d\mathbb{Q}_t = \frac{1}{\mathcal{Z}_t} \exp\left( - \int_0^t V(X'_s) \, ds \right) d\mathbb{P}_t^{X'}$$

where $\mathbb{P}_t^{X'}$ stands for the distribution of the random paths $(X'_s)_{0 \le s \le t}$ with a
given infinitesimal generator $L$. The marginal distributions $\gamma_t$ at time $t$ of the
unnormalized measures $\mathcal{Z}_t \, d\mathbb{Q}_t$ are the solution of the so-called imaginary time
Schrödinger equation, given in weak formulation on sufficiently regular functions
$f$ by the following integro-differential equation

$$\frac{d}{dt} \gamma_t(f) := \gamma_t(L^V(f)) \quad \text{with} \quad L^V = L - V$$

The errors introduced by the discretization of the time are well understood for
regular models; we refer the interested reader to [?, ?, ?, 75] in the context
of nonlinear filtering.

1.4 Interacting particle systems 

The stochastic particle interpretation of the Feynman-Kac measures (1) starts
with a population of $N$ candidate possible solutions $(\xi_0^1, \ldots, \xi_0^N)$ randomly
chosen w.r.t. some distribution $\eta_0$.

The coordinates $\xi_0^i$ are also called individuals or phenotypes, with $1 \le i \le N$. The
random evolution of the particles is decomposed into two main steps: the free
exploration and the adaptive selection transition.

During the updating-selection stage, multiple individuals in the current
population $(\xi_n^1, \ldots, \xi_n^N)$ at time $n \in \mathbb{N}$ are stochastically selected based on the fitness
function $G_n$. In practice, we choose a random proportion $B_n^i$ of an existing
solution $\xi_n^i$ in the current population with a mean value $\propto G_n(\xi_n^i)$ to breed a brand
new generation of "improved" solutions $(\widehat{\xi}_n^1, \ldots, \widehat{\xi}_n^N)$. For instance, for every
index $i$, with a probability $\epsilon_n G_n(\xi_n^i)$, we set $\widehat{\xi}_n^i = \xi_n^i$, otherwise we replace $\xi_n^i$ with
a new individual $\widehat{\xi}_n^i = \xi_n^j$ randomly chosen from the whole population with a
probability proportional to $G_n(\xi_n^j)$. The parameter $\epsilon_n \ge 0$ is a tuning parameter
that must satisfy the constraint $\epsilon_n G_n(\xi_n^i) \le 1$, for every $1 \le i \le N$. During the
prediction-mutation stage, every selected individual $\widehat{\xi}_n^i$ moves to a new solution
$\xi_{n+1}^i = x$ randomly chosen in $E_{n+1}$, with a distribution $M_{n+1}(\widehat{\xi}_n^i, dx)$.
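
The selection and mutation transitions described above can be sketched in a few lines. The code below is only an illustration under stated assumptions: the potential and the Markov transition are taken time-homogeneous, the function names (`selection`, `mutation`, `particle_population`, `toy_run`) are hypothetical, and the two-state toy model at the end uses made-up numbers.

```python
import random
from itertools import accumulate


def selection(particles, potential, eps, rng):
    """Updating-selection: keep xi^i with probability eps * G(xi^i), otherwise
    replace it by some xi^j drawn with probability proportional to G(xi^j)."""
    weights = [potential(x) for x in particles]
    assert all(0.0 <= eps * w <= 1.0 for w in weights), "need eps * G <= 1"
    cumulative = list(accumulate(weights))
    selected = []
    for x, w in zip(particles, weights):
        if rng.random() < eps * w:
            selected.append(x)
        else:
            selected.append(rng.choices(particles, cum_weights=cumulative)[0])
    return selected


def mutation(particles, sample_transition, rng):
    """Prediction-mutation: every selected individual moves independently."""
    return [sample_transition(x, rng) for x in particles]


def particle_population(n, size, sample_initial, sample_transition, potential,
                        eps=1.0, seed=0):
    """Population (xi_n^1, ..., xi_n^N) after n selection-mutation steps."""
    rng = random.Random(seed)
    population = [sample_initial(rng) for _ in range(size)]
    for _ in range(n):
        population = selection(population, potential, eps, rng)
        population = mutation(population, sample_transition, rng)
    return population


# Toy two-state example (hypothetical numbers, for illustration only).
M = [[0.9, 0.1], [0.2, 0.8]]
G = [0.9, 0.5]


def toy_run(n, size, eps=1.0, seed=0):
    return particle_population(
        n, size,
        sample_initial=lambda rng: rng.choices([0, 1], weights=[0.5, 0.5])[0],
        sample_transition=lambda x, rng: rng.choices([0, 1], weights=M[x])[0],
        potential=lambda x: G[x],
        eps=eps, seed=seed)


if __name__ == "__main__":
    population = toy_run(5, 2000)
    print(population.count(0) / len(population))
```

Setting `eps=0.0` recovers a plain multinomial selection, in which every individual is redrawn from the weighted population.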

If we interpret the updating-selection transition as a birth and death process,
then the important notion of the ancestral line of a current individual arises.
More precisely, when a particle $\widehat{\xi}_{n-1}^i \rightsquigarrow \xi_n^i$ evolves to a new location $\xi_n^i$, we can
interpret $\widehat{\xi}_{n-1}^i$ as the parent of $\xi_n^i$. Looking backwards in time and recalling that
the particle $\widehat{\xi}_{n-1}^i$ has selected a site $\xi_{n-1}^j$ in the configuration at time $(n-1)$, we
can interpret this site $\xi_{n-1}^j$ as the parent of $\widehat{\xi}_{n-1}^i$ and therefore as the ancestor
denoted $\xi_{n-1,n}^i$ at level $(n-1)$ of $\xi_n^i$. Running backwards in time we may trace
the whole ancestral line

$$\xi_{0,n}^i \longleftarrow \xi_{1,n}^i \longleftarrow \cdots \longleftarrow \xi_{n-1,n}^i \longleftarrow \xi_{n,n}^i = \xi_n^i$$

Most of the terminology we have used is drawn from filtering and genetic 
evolution theories. 

In filtering, the former particle model is dictated by the two steps prediction- 
updating learning equations of the conditional distributions of a signal process, 
given some noisy and partial observations. In this setting, the potential func- 
tions represent the likelihood function of the current observation, while the free 
exploration transitions are related to the Markov transitions of the signal pro- 
cess. 

In biology, the mutation-selection particle model presented above is used 
to mimic genetic evolutions of biological organisms and more generally natural 
evolution processes. For instance, in gene analysis, each population of individ- 
uals represents a chromosome and each individual particle is called a gene. In 
this setting the fitness potential function is usually time-homogeneous and it 
represents the quality and the adaptation potential value of the set of genes in a 
chromosome [61]. These particle algorithms are also used in population analysis
to model changes in the structure of population in time and in space. 

The different types of particle approximation measures associated with the
genetic type particle model described above are summarized in the following
synthetic picture corresponding to the case $N = 3$:

[Figure: evolution of the genetic type particle model for N = 3]
In the next 4 sections we give an overview of the 4 particle approximation
measures that can be extracted from the interacting population evolution model
described above. We also provide some basic formulation of the concentration
inequalities that will be treated in greater detail later. As a service to the reader
we also provide precise pointers to their location within the following chapters.
We already mention that the proofs of these results are quite subtle.

The precise form of the constants in these exponential inequalities depends
on the contraction properties of Feynman-Kac flows. Our stochastic analysis
requires combining the stability properties of the nonlinear semigroup of the
Feynman-Kac distribution flow $\eta_n$ with deep convergence results of empirical
processes theory associated with interacting random samples.

1.4.1 Current population models 

The occupation measures of the current population, represented by the red dots
in the above figure

$$\eta_n^N := \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_n^i}$$

converge to the $n$-th time marginals $\eta_n$ of the Feynman-Kac measures $\mathbb{Q}_n$. We
shall measure the performance of these particle estimates through several
concentration inequalities, with a special emphasis on uniform inequalities w.r.t.
the time parameter. Our results will basically be stated as follows.

1) For any time horizon $n \ge 0$, any bounded function $f$, any $N \ge 1$, and for
any $x \ge 0$, the probability of the event

$$\left[\eta_n^N - \eta_n\right](f) \le \frac{c_1}{N} \left(1 + x + \sqrt{x}\right) + \frac{c_2}{\sqrt{N}} \sqrt{x}$$

is greater than $1 - e^{-x}$. In the above display, $c_1$ stands for a finite constant
related to the bias of the particle model, while $c_2$ is related to the variance of
the scheme. The values of $c_1$ and $c_2$ don't depend on the time parameter.

We already mention one important consequence of these uniform
concentration inequalities for time homogeneous Feynman-Kac models. Under some
regularity conditions, the flow of measures $\eta_n$ tends to some fixed point
distribution $\eta_\infty$, in the sense that

$$\|\eta_n - \eta_\infty\|_{\rm tv} \le c_3 \, e^{-\delta n} \tag{11}$$

for some finite positive constants $c_3$ and $\delta$. The connexions between these
limiting measures and the top of the spectrum of Schrödinger operators are discussed
in section 2.7.1. We also refer the reader to section 2.7.2 for a discussion
on these quasi-invariant measures and Yaglom limits. Quantitative contraction
theorems for Feynman-Kac semigroups are developed in section 3.4.2. As
a direct consequence of the above inequalities, we find that for any $x \ge 0$, the
probability of the following event is greater than $1 - e^{-x}$

$$\left[\eta_n^N - \eta_\infty\right](f) \le \frac{c_1}{N} \left(1 + x + \sqrt{x}\right) + \frac{c_2}{\sqrt{N}} \sqrt{x} + c_3 \, e^{-\delta n}$$



2) For any $x = (x_i)_{1 \leq i \leq d} \in E_n = \mathbb{R}^d$, we set $(-\infty, x] = \prod_{i=1}^d (-\infty, x_i]$ and
we consider the distribution functions

$$F_n(x) = \eta_n\left(1_{(-\infty, x]}\right) \quad\text{and}\quad F_n^N(x) = \eta_n^N\left(1_{(-\infty, x]}\right)$$

The probability of the following event

$$\sqrt{N}\, \left\|F_n^N - F_n\right\| \leq c\, \sqrt{d\, (x + 1)}$$

is greater than $1 - e^{-x}$, for any $x \geq 0$, for some universal constant $c < \infty$ that
does not depend on the dimension, nor on the time parameter. Furthermore,
under the stability properties (11), if we set

$$F_\infty(x) = \eta_\infty\left(1_{(-\infty, x]}\right)$$

then the probability of the following event

$$\left\|F_n^N - F_\infty\right\| \leq \frac{c}{\sqrt{N}}\, \sqrt{d\, (x + 1)} + c_3\, e^{-\delta n}$$

is greater than $1 - e^{-x}$, for any $x \geq 0$, for some universal constant $c < \infty$ that
does not depend on the dimension.

For more precise statements, we refer the reader to corollary 6.1 and
to corollary 6.5, respectively.

The concentration properties of the particle measures $\eta_n^N$ around their
limiting values are developed in chapter 6. In section 6.3 we design a stochastic
perturbation analysis that allows us to exploit the stability properties of the limiting
Feynman-Kac semigroup. Finite marginal models are discussed in section 6.4.1.
Section 6.4.2 is concerned with the concentration inequalities of interacting
particle processes w.r.t. some collection of functions.

1.4.2 Particle free energy models 

Mimicking the multiplicative formula (5), we set

$$\mathcal{Z}_n^N = \prod_{0 \leq p < n} \eta_p^N(G_p) \quad\text{and}\quad \gamma_n^N(dx) = \mathcal{Z}_n^N \times \eta_n^N(dx) \qquad (12)$$

We already mention that these rather complex particle models provide an
unbiased estimate of the unnormalized measures. That is, we have that

$$\mathbb{E}\left(\eta_n^N(f_n) \prod_{0 \leq p < n} \eta_p^N(G_p)\right) = \mathbb{E}\left(f_n(X_n) \prod_{0 \leq p < n} G_p(X_p)\right) \qquad (13)$$

The concentration properties of the unbiased particle free energies $\mathcal{Z}_n^N$ around
their limiting values $\mathcal{Z}_n$ are developed in section 6.5. Our results will basically
be stated as follows.

For any $N \geq 1$, any $x \geq 0$, and any $\epsilon \in \{+1, -1\}$, the probability of each of the following
events

$$\frac{\epsilon}{n}\, \log \frac{\mathcal{Z}_n^N}{\mathcal{Z}_n} \leq \frac{c_1}{N} \left(1 + x + \sqrt{x}\right) + \frac{c_2}{\sqrt{N}}\, \sqrt{x}$$

is greater than $1 - e^{-x}$. In the above display, $c_1$ stands for a finite constant
related to the bias of the particle model, while $c_2$ is related to the variance of the
scheme. Here again, the values of $c_1$ and $c_2$ do not depend on the time parameter.
A more precise statement is provided in corollary 6.7.
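
The product formula (12) and the unbiasedness property (13) can be illustrated on a toy model. The following minimal Python sketch is not taken from the text: it assumes a finite state space, a time homogeneous transition matrix `M`, a strictly positive potential `G`, and multinomial selection. The function names `exact_free_energy` and `particle_free_energy` are hypothetical.

```python
import random


def exact_free_energy(eta0, M, G, n):
    """Z_n = E(prod_{0<=p<n} G(X_p)), via the unnormalized Feynman-Kac recursion."""
    states = range(len(eta0))
    gamma = list(eta0)                       # gamma_0 = eta_0
    for _ in range(n):
        weighted = [gamma[x] * G[x] for x in states]
        gamma = [sum(weighted[x] * M[x][y] for x in states) for y in states]
    return sum(gamma)                        # total mass gamma_n(1) = Z_n


def particle_free_energy(eta0, M, G, n, N, rng):
    """Z_n^N = prod_{0<=p<n} eta_p^N(G) for an N-particle selection/mutation model."""
    states = list(range(len(eta0)))
    particles = rng.choices(states, weights=eta0, k=N)
    Z = 1.0
    for _ in range(n):
        weights = [G[x] for x in particles]
        Z *= sum(weights) / N                # eta_p^N(G)
        # Selection: multinomial resampling proportional to the potential.
        particles = rng.choices(particles, weights=weights, k=N)
        # Mutation: each particle moves independently according to M.
        particles = [rng.choices(states, weights=M[x])[0] for x in particles]
    return Z


if __name__ == "__main__":
    rng = random.Random(0)
    eta0, G = [0.5, 0.5], [0.9, 0.5]
    M = [[0.8, 0.2], [0.3, 0.7]]
    runs = [particle_free_energy(eta0, M, G, 5, 50, rng) for _ in range(400)]
    print("exact:", exact_free_energy(eta0, M, G, 5))
    print("mean of particle estimates:", sum(runs) / len(runs))
```

Averaging independent copies of the particle estimate should approach the exact value for any fixed population size, which is what unbiasedness means here. Each single copy fluctuates at the scale described by the concentration inequality above.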

1.4.3 Genealogical tree model 

The occupation measure of the $N$-genealogical tree model represented by the
lines linking the blue dots converges as $N \to \infty$ to the distribution $\mathbb{Q}_n$

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^N \delta_{\left(\xi_{0,n}^i,\, \xi_{1,n}^i,\, \ldots,\, \xi_{n,n}^i\right)} = \mathbb{Q}_n \qquad (14)$$

Our concentration inequalities will basically be stated as follows. A more precise
statement is provided in corollary 6.2.

For any $n \geq 0$, any bounded function $\mathbf{f}_n$ on the path space $\mathbf{E}_n$ s.t. $\|\mathbf{f}_n\| \leq 1$,
and any $N \geq 1$, the probability of each of the following events

$$\left[\frac{1}{N} \sum_{i=1}^N \mathbf{f}_n\left(\xi_{0,n}^i, \xi_{1,n}^i, \ldots, \xi_{n,n}^i\right) - \mathbb{Q}_n(\mathbf{f}_n)\right] \leq c_1\, \frac{n+1}{N} \left(1 + x + \sqrt{x}\right) + c_2\, \sqrt{\frac{n+1}{N}}\, \sqrt{x}$$

is greater than $1 - e^{-x}$. In the above display, $c_1$ stands for a finite constant
related to the bias of the particle model, while $c_2$ is related to the variance of the
scheme. Here again, the values of $c_1$ and $c_2$ do not depend on the time parameter.

The concentration properties of genealogical tree occupation measures can
be derived more or less directly from those of the current population models.
This rather surprising assertion comes from the fact that the $n$-th time marginal
$\eta_n$ of a Feynman-Kac measure associated with a reference historical Markov
process has the same form as the measure (1). This equivalence principle
between $\mathbb{Q}_n$ and the marginal measures is developed in section 3.2, dedicated
to historical Feynman-Kac models.

Using these properties, we prove concentration properties for interacting
empirical processes associated with genealogical tree models. Our concentration
inequalities will basically be stated as follows. A more precise statement is
provided in section 6.4.2. We let $\mathcal{F}_n$ be the set of product functions of indicators
of cells in the path space $\mathbf{E}_n = (\mathbb{R}^{d_0} \times \cdots \times \mathbb{R}^{d_n})$, for some $d_p \geq 1$, $p \geq 0$. We
also denote by $\eta_n^N$ the occupation measure of the genealogical tree model. In this
notation, the probability of the following event

$$\sup_{\mathbf{f}_n \in \mathcal{F}_n} \left|\eta_n^N(\mathbf{f}_n) - \mathbb{Q}_n(\mathbf{f}_n)\right| \leq c\, (n + 1)\, \sqrt{\frac{\sum_{0 \leq p \leq n} d_p}{N}\, (x + 1)}$$

is greater than $1 - e^{-x}$, for any $x \geq 0$, for some universal constant $c < \infty$ that
does not depend on the dimension.



1.4.4 Complete genealogical tree models 

Mimicking the backward model (jS]) and the above formulae, we set 



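$$\Gamma_n^N = \mathcal{Z}_n^N \times \mathbb{Q}_n^N \qquad (15)$$

with

$$\mathbb{Q}_n^N(d(x_0, \ldots, x_n)) = \eta_n^N(dx_n) \prod_{q=1}^n \mathbb{M}_{q, \eta_{q-1}^N}(x_q, dx_{q-1})$$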

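Notice that the computation of sums w.r.t. these particle measures is
reduced to summations over the particle locations $\xi_n^i$. It is therefore natural
to identify a population of individuals $(\xi_n^1, \ldots, \xi_n^N)$ at time $n$ with the ordered set
of indexes $\{1, \ldots, N\}$. In this case, the occupation measures and the functions
are identified with the following line and column vectors

$$\eta_n^N := \left[\frac{1}{N}, \ldots, \frac{1}{N}\right] \quad\text{and}\quad f_n := \begin{pmatrix} f_n(\xi_n^1) \\ \vdots \\ f_n(\xi_n^N) \end{pmatrix}$$

and the matrices $\mathbb{M}_{n, \eta_{n-1}^N}$ by the $(N \times N)$ matrices

$$\mathbb{M}_{n, \eta_{n-1}^N} := \begin{pmatrix} \mathbb{M}_{n, \eta_{n-1}^N}(\xi_n^1, \xi_{n-1}^1) & \cdots & \mathbb{M}_{n, \eta_{n-1}^N}(\xi_n^1, \xi_{n-1}^N) \\ \vdots & & \vdots \\ \mathbb{M}_{n, \eta_{n-1}^N}(\xi_n^N, \xi_{n-1}^1) & \cdots & \mathbb{M}_{n, \eta_{n-1}^N}(\xi_n^N, \xi_{n-1}^N) \end{pmatrix} \qquad (16)$$

with the (i, j)-entries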



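For instance, the $\mathbb{Q}_n$-integration of normalized additive linear functionals of the
form

$$\mathbf{f}_n(x_0, \ldots, x_n) = \frac{1}{n+1} \sum_{0 \leq p \leq n} f_p(x_p) \qquad (17)$$

is given by the particle matrix approximation model

$$\mathbb{Q}_n^N(\mathbf{f}_n) = \frac{1}{n+1} \sum_{0 \leq p \leq n} \eta_n^N\, \mathbb{M}_{n, \eta_{n-1}^N}\, \mathbb{M}_{n-1, \eta_{n-2}^N} \cdots \mathbb{M}_{p+1, \eta_p^N}(f_p)$$

These types of additive functionals arise in the calculation of the sensitivity
measures discussed in section 2.4.1.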

The concentration properties of the particle measures $\mathbb{Q}_n^N$ around the
Feynman-Kac measures $\mathbb{Q}_n$ are developed in section 6.6. Special emphasis is given to the
additive functional models (17). In section 6.6.3 we extend the stochastic
perturbation methodology developed in section 6.3 for time marginal models to the
particle backward Markov chain associated with the random stochastic matrices
(16). This technique allows us to exploit not only the stability properties of the
limiting Feynman-Kac semigroup, but also those of the particle backward
Markov chain model.

Our concentration inequalities will basically be stated as follows. A more
precise statement is provided in corollary 6.9 and in corollary 6.11.

For any $n \geq 0$, any normalized additive functional of the form (17) with
$\max_{0 \leq p \leq n} \|f_p\| \leq 1$, and any $N \geq 1$, the probability of each of the following
events

$$\left[\mathbb{Q}_n^N - \mathbb{Q}_n\right](\mathbf{f}_n) \leq c_1\, \frac{1}{N} \left(1 + (x + \sqrt{x})\right) + c_2\, \sqrt{\frac{x}{N (n+1)}}$$

is greater than $1 - e^{-x}$. In the above display, $c_1$ stands for a finite constant
related to the bias of the particle model, while $c_2$ is related to the variance of the
scheme. Here again, the values of $c_1$ and $c_2$ do not depend on the time parameter.
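For any $a = (a_i)_{1 \leq i \leq d} \in E_n = \mathbb{R}^d$, we denote by $C_a$ the cell

$$C_a := (-\infty, a] = \prod_{i=1}^d (-\infty, a_i]$$

and by $\mathbf{f}_{a,n}$ the additive functional

$$\mathbf{f}_{a,n}(x_0, \ldots, x_n) = \frac{1}{n+1} \sum_{0 \leq p \leq n} 1_{(-\infty, a]}(x_p)$$

The probability of the following event

$$\sup_{a \in \mathbb{R}^d} \left|\mathbb{Q}_n^N(\mathbf{f}_{a,n}) - \mathbb{Q}_n(\mathbf{f}_{a,n})\right| \leq c\, \sqrt{\frac{d}{N}\, (x + 1)}$$
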
is greater than $1 - e^{-x}$, for any $x \geq 0$, for some constant $c < \infty$ that does not
depend on the dimension, nor on the time horizon.

Remark 1.1 One way to turn all of these inequalities into Bernstein
style concentration inequalities is as follows. For any exponential inequality of
the form

$$\forall x \geq 0 \qquad \mathbb{P}\left(X \leq a x + \sqrt{2 b x} + c\right) \geq 1 - e^{-x}$$

for some nonnegative constants $(a, b, c)$, we also have

$$\forall y \geq 0 \qquad \mathbb{P}\left(X \leq y + c\right) \geq 1 - \exp\left(-\frac{y^2}{2\, (b + a y)}\right)$$

A proof of this result is provided in lemma \J7d\ 
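
The mechanism behind this remark is the choice $x = y^2 / (2 (b + a y))$, for which $a x + \sqrt{2 b x} \leq y$. The short Python sketch below only checks this elementary inequality numerically; it is an illustration and not the proof referred to above. The function names are hypothetical, and we assume $b + a y > 0$.

```python
import math


def bernstein_level(a, b, y):
    """The level x = y^2 / (2 (b + a y)) plugged into the generic inequality."""
    return y * y / (2.0 * (b + a * y))


def generic_deviation(a, b, x):
    """The deviation term a x + sqrt(2 b x) of the generic inequality (without c)."""
    return a * x + math.sqrt(2.0 * b * x)


def bernstein_tail_bound(a, b, y):
    """Upper bound exp(-y^2 / (2 (b + a y))) on P(X > y + c)."""
    return math.exp(-bernstein_level(a, b, y))


if __name__ == "__main__":
    for a, b, y in [(0.0, 1.0, 2.0), (0.5, 1.0, 2.0), (2.0, 10.0, 5.0)]:
        x = bernstein_level(a, b, y)
        print(a, b, y, generic_deviation(a, b, x), bernstein_tail_bound(a, b, y))
```

When $a = 0$ the deviation at this level equals $y$ exactly and the bound reduces to the Gaussian tail $e^{-y^2/(2b)}$.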

1.5 Basic notation 

This section provides some background from stochastic analysis and integral
operator theory that we require for our proofs. Most of the results, with detailed
proofs, can be found in the book [25] on Feynman-Kac formulae and interacting
particle methods. Our proofs also contain cross-references to this rather
well known material, so the reader may wish to skip this section and proceed directly to
chapter 2, dedicated to some application domains of Feynman-Kac models.

1.5.1 Integral operators 

We denote respectively by $\mathcal{M}(E)$, $\mathcal{M}_0(E)$, $\mathcal{P}(E)$, and $\mathcal{B}(E)$, the set of all finite
signed measures on some measurable space $(E, \mathcal{E})$, the convex subset of measures
with null mass, the set of all probability measures, and the Banach space of all
bounded and measurable functions $f$ equipped with the uniform norm $\|f\|$. We
also denote by $\mathrm{Osc}_1(E)$ and by $\mathcal{B}_1(E)$ the set of $\mathcal{E}$-measurable functions $f$ with
oscillations $\mathrm{osc}(f) \leq 1$, and respectively with $\|f\| \leq 1$. We let

$$\mu(f) = \int \mu(dx)\, f(x)$$

be the Lebesgue integral of a function $f \in \mathcal{B}(E)$, with respect to a measure
$\mu \in \mathcal{M}(E)$.

We recall that the total variation distance on $\mathcal{M}(E)$ is defined for any
$\mu \in \mathcal{M}(E)$ by

$$\|\mu\|_{tv} = \frac{1}{2} \sup_{(A, B) \in \mathcal{E}^2} \left(\mu(A) - \mu(B)\right)$$

We recall that a bounded integral operator $M$ from a measurable space $(E, \mathcal{E})$
into an auxiliary measurable space $(F, \mathcal{F})$ is an operator $f \mapsto M(f)$ from $\mathcal{B}(F)$
into $\mathcal{B}(E)$ such that the functions

$$M(f)(x) := \int_F M(x, dy)\, f(y)$$

are $\mathcal{E}$-measurable and bounded, for any $f \in \mathcal{B}(F)$. A Markov kernel is a positive
and bounded integral operator $M$ with $M(1) = 1$. Given a pair of bounded
integral operators $(M_1, M_2)$, we let $(M_1 M_2)$ be the composition operator defined
by $(M_1 M_2)(f) = M_1(M_2(f))$. For time homogeneous state spaces, we denote by
$M^m = M^{m-1} M = M M^{m-1}$ the $m$-th composition of a given bounded integral
operator $M$, with $m \geq 1$. A bounded integral operator $M$ from a measurable
space $(E, \mathcal{E})$ into an auxiliary measurable space $(F, \mathcal{F})$ also generates a dual
operator

$$\mu(dx) \mapsto (\mu M)(dx) = \int \mu(dy)\, M(y, dx)$$

from $\mathcal{M}(E)$ into $\mathcal{M}(F)$ defined by $(\mu M)(f) := \mu(M(f))$. We also use the
notation

$$K\left([f - K(f)]^2\right)(x) := K\left([f - K(f)(x)]^2\right)(x)$$

for some bounded integral operator $K$ and some bounded function $f$.

We prefer to avoid unnecessary abstraction and technical assumptions, so 
we frame the standing assumption that all the test functions are in the unit 
sphere, and the integral operators, and all the random variables are sufficiently 
regular that we are justified in computing integral transport equations, regular 
versions of conditional expectations, and so forth. 

1.5.2 Contraction coefficients 

When the bounded integral operator $M$ has a constant mass, that is, when
$M(1)(x) = M(1)(y)$ for any $(x, y) \in E^2$, the operator $\mu \mapsto \mu M$ maps $\mathcal{M}_0(E)$
into $\mathcal{M}_0(F)$. In this situation, we let $\beta(M)$ be the Dobrushin coefficient of a
bounded integral operator $M$ defined by the formula

$$\beta(M) := \sup \left\{\mathrm{osc}(M(f)) \; ; \; f \in \mathrm{Osc}_1(F)\right\}$$

Notice that $\beta(M)$ is the operator norm of $M$ on $\mathcal{M}_0(E)$, and we have the
equivalent formulations

$$\beta(M) = \sup \left\{\|M(x, \cdot) - M(y, \cdot)\|_{tv} \; ; \; (x, y) \in E^2\right\} = \sup_{\mu \in \mathcal{M}_0(E)} \|\mu M\|_{tv} / \|\mu\|_{tv}$$

A detailed proof of these well known formulae can be found in [5S] . 
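
For a Markov kernel on a finite state space these formulations can be compared directly. The Python sketch below is only an illustration under that finiteness assumption: the kernel is a list of rows, and the supremum over functions with oscillation at most one is scanned over indicator functions $1_A$, which is enough for a pair of probability measures. All function names are hypothetical.

```python
from itertools import combinations


def tv_norm(mu):
    """||mu||_tv = (1/2) sup_{A,B} (mu(A) - mu(B)) = (1/2) sum_x |mu(x)| on a finite set."""
    return 0.5 * sum(abs(m) for m in mu)


def dobrushin(M):
    """beta(M) as the largest total variation distance between two rows of M."""
    return max(
        tv_norm([p - q for p, q in zip(row_x, row_y)])
        for row_x in M
        for row_y in M
    )


def dobrushin_via_oscillations(M):
    """beta(M) as the largest oscillation of M(1_A) over all subsets A of the target space."""
    n_cols = len(M[0])
    best = 0.0
    for size in range(n_cols + 1):
        for A in combinations(range(n_cols), size):
            values = [sum(row[j] for j in A) for row in M]   # x -> M(1_A)(x)
            best = max(best, max(values) - min(values))
    return best


if __name__ == "__main__":
    M = [[0.9, 0.1, 0.0], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]
    print(dobrushin(M), dobrushin_via_oscillations(M))
```

The subset scan is exponential in the number of states, so it is only meant as a consistency check on small examples.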

Given a positive and bounded potential function $G$ on $E$, we also denote
by $\Psi_G$ the Boltzmann-Gibbs mapping from $\mathcal{P}(E)$ into itself defined for any
$\mu \in \mathcal{P}(E)$ by

$$\Psi_G(\mu)(dx) = \frac{1}{\mu(G)}\, G(x)\, \mu(dx)$$

For $]0, 1]$-valued potential functions, we also mention that $\Psi_G(\mu)$ can be
expressed as a nonlinear Markov transport equation

$$\Psi_G(\mu) = \mu S_{\mu, G} \qquad (18)$$

with the Markov transitions

$$S_{\mu, G}(x, dy) = G(x)\, \delta_x(dy) + (1 - G(x))\, \Psi_G(\mu)(dy)$$
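We notice that, writing $S_\mu$ for $S_{\mu, G}$,

$$\Psi_G(\mu) - \Psi_G(\nu) = (\mu - \nu) S_\mu + \nu (S_\mu - S_\nu)$$

and

$$\nu (S_\mu - S_\nu) = (1 - \nu(G)) \left[\Psi_G(\mu) - \Psi_G(\nu)\right]$$

from which we find the formula

$$\Psi_G(\mu) - \Psi_G(\nu) = \frac{1}{\nu(G)}\, (\mu - \nu) S_\mu$$

In addition, using the fact that

$$\forall (x, A) \in (E, \mathcal{E}) \qquad S_\mu(x, A) \geq (1 - \|G\|)\, \Psi_G(\mu)(A)$$

we prove that $\beta(S_\mu) \leq \|G\|$ and

$$\|\Psi_G(\mu) - \Psi_G(\nu)\|_{tv} \leq \frac{\|G\|}{\mu(G) \vee \nu(G)}\, \|\mu - \nu\|_{tv}$$

If we set $\Phi(\mu) = \Psi_G(\mu) M$, for some Markov transition $M$, then we have the
decomposition

$$\Phi(\mu) - \Phi(\nu) = \frac{1}{\nu(G)}\, (\mu - \nu) S_\mu M \qquad (19)$$

for any couple of measures $\nu, \mu$ on $E$. From the previous discussion, we also find
the following Lipschitz estimates

$$\|\Phi(\mu) - \Phi(\nu)\|_{tv} \leq \frac{\|G\|}{\mu(G) \vee \nu(G)}\, \beta(M)\, \|\mu - \nu\|_{tv} \qquad (20)$$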
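
On a finite state space, the transport equation (18) and the Lipschitz estimate for the Boltzmann-Gibbs mapping can be checked by hand. The Python sketch below is a minimal illustration under that assumption: measures are lists of weights, the potential takes values in $]0, 1]$, and all function names are hypothetical.

```python
def boltzmann_gibbs(mu, G):
    """Psi_G(mu)(x) = G(x) mu(x) / mu(G)."""
    mass = sum(m * g for m, g in zip(mu, G))           # mu(G)
    return [m * g / mass for m, g in zip(mu, G)]


def selection_kernel(mu, G):
    """S_{mu,G}(x, y) = G(x) 1_{x=y} + (1 - G(x)) Psi_G(mu)(y)."""
    psi = boltzmann_gibbs(mu, G)
    n = len(mu)
    return [
        [(G[x] if x == y else 0.0) + (1.0 - G[x]) * psi[y] for y in range(n)]
        for x in range(n)
    ]


def push_forward(mu, K):
    """(mu K)(y) = sum_x mu(x) K(x, y)."""
    return [sum(mu[x] * K[x][y] for x in range(len(mu))) for y in range(len(K[0]))]


def tv_distance(mu, nu):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))


if __name__ == "__main__":
    mu, nu, G = [0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [1.0, 0.4, 0.1]
    print(boltzmann_gibbs(mu, G))
    print(push_forward(mu, selection_kernel(mu, G)))   # same vector, by (18)
    print(tv_distance(boltzmann_gibbs(mu, G), boltzmann_gibbs(nu, G)))
```

The kernel `selection_kernel` keeps a particle at its site with probability $G(x)$ and otherwise redraws it from $\Psi_G(\mu)$, which is the acceptance-rejection reading of the selection step.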

We end this section with an interesting contraction property of a Markov
transition

$$M_G(x, dy) = \frac{M(x, dy)\, G(y)}{M(G)(x)} = \Psi_G(\delta_x M)(dy) \qquad (21)$$

associated with a $]0, 1]$-valued potential function $G$, with

$$g = \sup_{x, y} G(x) / G(y) < \infty \qquad (22)$$

It is easily checked that

$$|M_G(f)(x) - M_G(f)(y)| = |\Psi_G(\delta_x M)(f) - \Psi_G(\delta_y M)(f)| \leq g\, \|\delta_x M - \delta_y M\|_{tv}$$

from which we conclude that

$$\beta(M_G) \leq g\, \beta(M) \qquad (23)$$

1.5.3 Orlicz norms and Gaussian moments 

We let $\pi_\psi(Y)$ be the Orlicz norm of an $\mathbb{R}$-valued random variable $Y$ associated
with the convex function $\psi(u) = e^{u^2} - 1$, and defined by

$$\pi_\psi(Y) = \inf \left\{a \in (0, \infty) \; : \; \mathbb{E}\left(\psi(|Y| / a)\right) \leq 1\right\}$$

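with the convention $\inf \emptyset = \infty$. Notice that

$$\pi_\psi(Y) \leq c \iff \mathbb{E}\left(\psi(Y / c)\right) \leq 1$$

For instance, the Orlicz norm of a Gaussian and centred random variable $U$,
s.t. $\mathbb{E}(U^2) = 1$, is given by $\pi_\psi(U) = \sqrt{8/3}$. We also recall that

$$\mathbb{E}\left(U^{2m}\right) = b(2m)^{2m} := (2m)_m\, 2^{-m} \quad\text{and}\quad \mathbb{E}\left(|U|^{2m+1}\right) \leq b(2m+1)^{2m+1} := \frac{(2m+1)_{(m+1)}}{\sqrt{m + 1/2}}\, 2^{-(m + 1/2)} \qquad (24)$$

with $(q + p)_p := (q + p)! / q!$. The second assertion comes from the fact that

$$\mathbb{E}\left(|U|^{2m+1}\right)^2 \leq \mathbb{E}\left(U^{2m}\right)\, \mathbb{E}\left(U^{2(m+1)}\right)$$

and therefore

$$b(2m+1)^{2(2m+1)} = \mathbb{E}\left(U^{2m}\right)\, \mathbb{E}\left(U^{2(m+1)}\right) = 2^{-(2m+1)}\, (2m)_m\, (2(m+1))_{(m+1)}$$

This formula is a direct consequence of the following decompositions

$$(2(m+1))_{(m+1)} = \frac{(2(m+1))!}{(m+1)!} = 2\, \frac{(2m+1)!}{m!} = 2\, (2m+1)_{(m+1)}$$

and

$$(2m)_m = \frac{1}{2m+1}\, \frac{(2m+1)!}{m!} = \frac{1}{2m+1}\, (2m+1)_{(m+1)}$$

We also mention that

$$b(m) \leq b(2m) \qquad (25)$$

Indeed, for even numbers $m = 2p$ we have

$$b(m)^{2m} = b(2p)^{4p} = \mathbb{E}\left(U^{2p}\right)^2 \leq \mathbb{E}\left(U^{4p}\right) = b(4p)^{4p} = b(2m)^{2m}$$

and for odd numbers $m = (2p + 1)$, we have

$$b(m)^{2m} = b(2p+1)^{2(2p+1)} = \mathbb{E}\left(U^{2p}\right)\, \mathbb{E}\left(U^{2(p+1)}\right)$$
$$\leq \mathbb{E}\left(\left(U^{2p}\right)^{\frac{2p+1}{p}}\right)^{\frac{p}{2p+1}}\, \mathbb{E}\left(\left(U^{2(p+1)}\right)^{\frac{2p+1}{p+1}}\right)^{\frac{p+1}{2p+1}}$$
$$= \mathbb{E}\left(U^{2(2p+1)}\right) = b(2(2p+1))^{2(2p+1)} = b(2m)^{2m}$$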
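
The constants $b(m)$ of (24) are easy to tabulate, which gives a quick numerical sanity check of the even moments and of the monotonicity property (25). The Python sketch below is an illustration only; the helper names `falling` and `b` are our own choice.

```python
import math


def falling(upper, count):
    """(q + p)_p := (q + p)! / q!, called with upper = q + p and count = p."""
    return math.factorial(upper) // math.factorial(upper - count)


def b(m):
    """The constant b(m) defined through b(m)^m in formula (24)."""
    if m % 2 == 0:
        k = m // 2
        power = falling(2 * k, k) * 2.0 ** (-k)                  # E(U^{2k})
    else:
        k = (m - 1) // 2
        power = falling(2 * k + 1, k + 1) / math.sqrt(k + 0.5) * 2.0 ** (-(k + 0.5))
    return power ** (1.0 / m)


if __name__ == "__main__":
    for m in range(1, 9):
        print(m, b(m) ** m, b(m), b(2 * m))
```

For even indices `b(m) ** m` returns the Gaussian moments 1, 3, 15, 105, and for odd indices it returns an upper bound on the absolute moments.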

2 Some application domains 

2.1 Introduction 

Feynman-Kac particle methods are also termed quantum Monte Carlo methods
in computational physics, genetic algorithms in computer science, and particle
filters and/or sequential Monte Carlo methods in information theory, as well as
in Bayesian statistics.

The mathematical foundations of these advanced interacting Monte Carlo
methodologies are now fifteen years old [19]. Since then, many descriptions
and variants of these models have been published in the applied probability,
signal processing and Bayesian statistics literature. For a detailed discussion of
their application domains, with a precise bibliography of who first did what and when,
we refer the reader to any of the following references |23l ^S[ , and [261 H5] . 

In the present section, we merely content ourselves with illustrating the rather
abstract models (1) with the Feynman-Kac representations of 20 more or less
well known conditional distributions, including three more recent applications
related to island particle models, functional kinetic parameter derivatives, and
gradient analysis of Markov semigroups.

The forthcoming series of examples, combined with their mean field particle
interpretation models described in section 1.4, also illustrates the ability of the
Feynman-Kac particle methodology to solve complex conditional distribution
flows as well as their normalizing constants.

Of course, this selected list of applications does not attempt to be exhaustive.
The selection of topics is largely influenced by the personal taste of the authors.
A complete description of how particle methods are applied in each application
area would of course require separate volumes, with precise computer
simulations and comparisons with different types of particle models and other
existing algorithms.

We also limit ourselves to describing the key ideas in a simple way, often
sacrificing generality. Some applications are nowadays routine, and in this case
we provide precise pointers to existing, more application-related articles in the
literature. Readers who wish to know more about some specific application of
these particle algorithms are invited to consult the referenced papers.

One natural path of "easy reading" will probably be to choose a familiar
or attractive application area and to explore some selected parts of the lecture
notes in terms of this choice. Nevertheless, this advice must not be taken too
literally. To see the impact of particle methods, it is essential to understand
the full force of Feynman-Kac modeling techniques in various research domains.
Upon doing so, the reader will have a powerful weapon for the discovery of new
particle interpretation models. The principal challenge is to understand the
theory well enough to reduce it to practice.

2.2 Boltzmann-Gibbs measures 

2.2.1 Interacting Markov chain Monte Carlo methods 

Suppose we are given a sequence of target probability measures on some
measurable state space $E$ of the following form

$$\mu_n(dx) = \frac{1}{\mathcal{Z}_n} \left\{\prod_{0 \leq p < n} h_p(x)\right\} \lambda(dx) \qquad (26)$$

with some sequence of bounded nonnegative potential functions

$$h_n : x \in E \mapsto h_n(x) \in (0, \infty)$$

and some reference probability measure $\lambda$ on $E$. In the above displayed formula,
$\mathcal{Z}_n$ stands for a normalizing constant. We use the convention $\prod_\emptyset = 1$ and
$\mu_0 = \lambda$.

We further assume that we have a dedicated Markov chain Monte Carlo
transition $\mu_n = \mu_n M_n$ with prescribed target measures $\mu_n$, at any time step.
Using the fact that

$$\mu_{n+1} = \Psi_{h_n}(\mu_n)$$

with the Boltzmann-Gibbs transformations defined in ([S]), we prove that

$$\mu_{n+1} = \mu_{n+1} M_{n+1} = \Psi_{h_n}(\mu_n) M_{n+1}$$

from which we conclude that

$$\mu_n(f) = \mathbb{E}\left(f(X_n) \prod_{0 \leq p < n} h_p(X_p)\right) \Big/ \mathbb{E}\left(\prod_{0 \leq p < n} h_p(X_p)\right)$$

with the reference Markov chain

$$\mathbb{P}(X_n \in dx \mid X_{n-1}) = M_n(X_{n-1}, dx)$$
In addition, we have

$$\mathcal{Z}_{n+1} = \lambda\left(\prod_{0 \leq p \leq n} h_p\right) = \prod_{0 \leq p \leq n} \mu_p(h_p)$$


We illustrate these rather abstract models with two applications related 
respectively to probability restriction models and stochastic optimization simu- 
lated annealing type models. For a more thorough discussion on these interact- 
ing MCMC models, and related sequential Monte Carlo methods, we refer the 
reader to [2511271. 



2.2.2 Probability restrictions 

If we choose Markov chain Monte Carlo type local moves

$$\mu_n = \mu_n M_n$$

with some prescribed target Boltzmann-Gibbs measures

$$\mu_n(dx) \propto 1_{A_n}(x)\, \lambda(dx)$$

associated with a sequence of decreasing subsets $A_n \downarrow$, and some reference
measure $\lambda$, then we find that $\mu_n = \eta_n$ and $\mathcal{Z}_n = \lambda(A_n)$, as soon as the potential
functions in (1) and (26) are chosen so that

$$G_n = h_n = 1_{A_{n+1}}$$

These stochastic models arise in several application domains. In the computer
science literature, the corresponding particle approximation models are sometimes
called subset methods, sequential sampling plans, randomized algorithms,
or level splitting algorithms. They were used to solve complex NP-hard com- 
binatorial counting problems [2] , extreme quantile probabilities [T51 [SD] , and 
uncertainty propagations in numerical codes |10| . 
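
As an illustration of this restriction model, the Python sketch below estimates a Gaussian tail probability $\lambda(A_n)$ by a level splitting scheme. It is a minimal sketch under assumptions of our own: $\lambda$ is the standard Gaussian law on the real line, the subsets are half-lines $A_p = [\text{level}_p, \infty)$, and the local moves are random walk Metropolis steps that leave $\lambda$ restricted to the current subset invariant. The function name and its parameters are hypothetical.

```python
import math
import random


def splitting_estimate(levels, n_particles, n_mcmc, step, rng):
    """Particle estimate of lambda([levels[-1], oo)) for lambda = N(0, 1)."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]   # samples of lambda
    estimate = 1.0
    for level in levels:
        # Potential G = indicator of the next subset: only survivors have weight one.
        survivors = [x for x in particles if x >= level]
        if not survivors:
            return 0.0
        estimate *= len(survivors) / n_particles        # eta_p^N(G_p)
        # Selection: resample uniformly among the survivors.
        particles = [rng.choice(survivors) for _ in range(n_particles)]
        # Mutation: Metropolis moves targeting lambda restricted to [level, oo).
        for _ in range(n_mcmc):
            moved = []
            for x in particles:
                y = x + step * rng.gauss(0.0, 1.0)
                accept = math.exp(min(0.0, 0.5 * (x * x - y * y)))
                moved.append(y if y >= level and rng.random() < accept else x)
            particles = moved
    return estimate


if __name__ == "__main__":
    rng = random.Random(1)
    levels = [0.5, 1.0, 1.5, 2.0]
    print("particle estimate:", splitting_estimate(levels, 2000, 5, 1.0, rng))
    print("exact value:", 0.5 * math.erfc(2.0 / math.sqrt(2.0)))
```

The product of the successive survival proportions plays the role of the particle free energy, and the number of Metropolis steps per level controls how well the population decorrelates between two selections.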

2.2.3 Stochastic optimization

If we choose Markov chain Monte Carlo type local moves $\mu_n = \mu_n M_n$ with some
prescribed target Boltzmann-Gibbs measures

$$\mu_n(dx) \propto e^{-\beta_n V(x)}\, \lambda(dx)$$

associated with a sequence of increasing inverse temperature parameters $\beta_n \uparrow$,
and some reference measure $\lambda$, then we find that $\mu_n = \eta_n$ and $\mathcal{Z}_n = \lambda(e^{-\beta_n V})$
as soon as the potential functions in (1) and (26) are chosen so that

$$G_n = h_n = e^{-(\beta_{n+1} - \beta_n) V}$$

For instance, we can assume that the Markov transition $M_n = M_{n, \beta_n}^{m_n}$ is the
$m_n$-iterate of the following Metropolis-Hastings transitions

$$M_{n, \beta_n}(x, dy) = K_n(x, dy)\, \min\left(1, e^{-\beta_n (V(y) - V(x))}\right) + \left(1 - \int_z K_n(x, dz)\, \min\left(1, e^{-\beta_n (V(z) - V(x))}\right)\right) \delta_x(dy)$$

We finish this section with an assorted collection of comments on
interacting Markov chain Monte Carlo algorithms associated with the
Feynman-Kac models described above.

Conventional Markov chain Monte Carlo methods (abbreviated MCMC methods)
with time varying target measures $\mu_n$ can be seen as a single particle
model with only mutation explorations according to the Markov transitions
$M_n = K_n^{m_n}$, where $K_n^{m_n}$ stands for the $m_n$-th iterate of an MCMC transition $K_n$ s.t.
$\mu_n = \mu_n K_n$. In this situation, we choose a judicious increasing sequence $m_n$ so
that the nonhomogeneous Markov chain is sufficiently stable, even if the target
measures become more and more complex to sample. When the target measure
is fixed, say of the form $\mu_T$ for some large $T$, the MCMC sampler again uses
a single particle that behaves as a Markov chain with time homogeneous
transitions $M_T$. The obvious drawback of these two conventional MCMC samplers
is that the user does not know how many steps are really needed to be close
to the equilibrium target measure. A wrong choice will return samples with a
distribution far from the desired target measure.

Interacting MCMC methods run a population of MCMC samplers that interact
with each other through a recycling-updating mechanism, so that the occupation
measures of the current population converge to the target measure when we
increase the population size. In contrast with conventional MCMC methods,
there are no burn-in time questions, nor any need for a quantitative analysis to estimate
the convergence to equilibrium of the MCMC chain.
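
The interacting simulated annealing model of this section can be sketched in a few lines. The Python code below is a minimal illustration under assumptions of our own, not a prescription from the text: the state space is the cycle $\{0, \ldots, K-1\}$, $\lambda$ is the uniform law, the proposal moves one site to the left or to the right, and the inverse temperatures start at $\beta_0 = 0$ so that $\mu_0 = \lambda$. The function name and its arguments are hypothetical.

```python
import math
import random


def annealed_particle_sampler(V, betas, n_particles, n_mcmc, rng):
    """Returns (estimate of lambda(exp(-betas[-1] V)), final particle population)."""
    K = len(V)
    particles = [rng.randrange(K) for _ in range(n_particles)]     # samples of lambda
    Z = 1.0
    for beta_prev, beta_next in zip(betas, betas[1:]):
        # Selection with the potential G = exp(-(beta_next - beta_prev) V).
        weights = [math.exp(-(beta_next - beta_prev) * V[x]) for x in particles]
        Z *= sum(weights) / n_particles
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Mutation: Metropolis moves targeting exp(-beta_next V) on the cycle.
        for _ in range(n_mcmc):
            for i, x in enumerate(particles):
                y = (x + rng.choice((-1, 1))) % K
                if rng.random() < math.exp(min(0.0, -beta_next * (V[y] - V[x]))):
                    particles[i] = y
    return Z, particles


if __name__ == "__main__":
    rng = random.Random(2)
    V = [(x - 7) ** 2 / 10.0 for x in range(20)]
    betas = [0.0, 0.5, 1.0, 2.0, 4.0]
    Z, particles = annealed_particle_sampler(V, betas, 2000, 5, rng)
    print("particle estimate:", Z)
    print("exact value:", sum(math.exp(-betas[-1] * v) for v in V) / len(V))
    print("most visited state:", max(set(particles), key=particles.count))
```

As the inverse temperature grows, the population concentrates around the minimizers of `V`, while the running product estimates the normalizing constant $\mathcal{Z}_n = \lambda(e^{-\beta_n V})$.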

2.2.4 Island particle models 

In this section, we provide a brief discussion on interacting colonies and island 
particle models arising in mathematical biology and evolutionary computing lit- 
erature [Ml [57] . The evolution of these stochastic island models is again defined 
in terms of a free evolution and a selection transition. During the free evolution
each island evolves separately as a single mean field particle model with a
mutation-selection mechanism between the individuals in the island population.
The selection pressure between islands is related to the average fitness of the
individuals in the island population. A colony with poor fitness is killed, and
replaced by brand new generations of "improved" individuals coming from better
fitted islands.

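For any measurable function $f_n$ on $E_n$, we set

$$X_n^{(0)} = X_n \in E_n^{(0)} := E_n \quad\text{and}\quad f_n^{(0)} = f_n$$

and we denote by

$$X_n^{(1)} = \left(X_n^{(1,i)}\right)_{1 \leq i \leq N_1} \in E_n^{(1)} := \left(E_n^{(0)}\right)^{N_1}$$

the $N_1$-particle model associated with the reference Markov chain $X_n^{(0)}$, and the
potential function $G_n^{(0)}$.

To get one step further, we denote by $f_n^{(1)}$ the empirical mean valued function
on $E_n^{(1)}$ defined by

$$f_n^{(1)}\left(X_n^{(1)}\right) = \frac{1}{N_1} \sum_{i=1}^{N_1} f_n^{(0)}\left(X_n^{(1,i)}\right)$$

In this notation, the potential value of the random state $X_n^{(1)}$ is given by the
formula

$$G_n^{(1)}\left(X_n^{(1)}\right) := \frac{1}{N_1} \sum_{i=1}^{N_1} G_n^{(0)}\left(X_n^{(1,i)}\right)$$

By construction, we have the almost sure property

$$N_1 = 1 \implies X_n^{(0)} = X_n^{(1)} \quad\text{and}\quad G_n^{(1)}\left(X_n^{(1)}\right) = G_n^{(0)}\left(X_n^{(0)}\right)$$

More interestingly, by the unbiased properties (13) we have for any population
size $N_1$

$$\mathbb{E}\left(f_n^{(1)}\left(X_n^{(1)}\right) \prod_{0 \leq p < n} G_p^{(1)}\left(X_p^{(1)}\right)\right) = \mathbb{E}\left(f_n^{(0)}\left(X_n^{(0)}\right) \prod_{0 \leq p < n} G_p^{(0)}\left(X_p^{(0)}\right)\right)$$

Iterating this construction, we let

$$X_n^{(2)} = \left(X_n^{(2,i)}\right)_{1 \leq i \leq N_2} \in E_n^{(2)} := \left(E_n^{(1)}\right)^{N_2}$$

be the $N_2$-particle model associated with the reference Markov chain $X_n^{(1)}$, and the
potential function $G_n^{(1)}$. For any function $f_n^{(1)}$ on $E_n^{(1)}$, we denote by $f_n^{(2)}$ the
empirical mean valued function on $E_n^{(2)}$ defined by

$$f_n^{(2)}\left(X_n^{(2)}\right) = \frac{1}{N_2} \sum_{i=1}^{N_2} f_n^{(1)}\left(X_n^{(2,i)}\right)$$

In this notation, the potential value of the random state $X_n^{(2)}$ is given by the
formula

$$G_n^{(2)}\left(X_n^{(2)}\right) := \frac{1}{N_2} \sum_{i=1}^{N_2} G_n^{(1)}\left(X_n^{(2,i)}\right)$$

and for any population size $N_2$

$$\mathbb{E}\left(f_n^{(2)}\left(X_n^{(2)}\right) \prod_{0 \leq p < n} G_p^{(2)}\left(X_p^{(2)}\right)\right) = \mathbb{E}\left(f_n^{(1)}\left(X_n^{(1)}\right) \prod_{0 \leq p < n} G_p^{(1)}\left(X_p^{(1)}\right)\right)$$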

2.2.5 Particle Markov chain Monte Carlo methods 

In this section, we present an interacting particle version of the particle Markov
chain Monte Carlo method developed in the recent seminal article by C. Andrieu,
A. Doucet, and R. Holenstein [5].

We consider a collection of Markov transitions and positive potential functions
$(M_{\theta,n}, G_{\theta,n})$ that depend on some random variable $\Theta = \theta$, with distribution $\nu$
on some state space $S$. We let $\eta_{\theta,n}$ be the $n$-time marginal of the Feynman-Kac
measures defined as in (1), by replacing $(M_n, G_n)$ by $(M_{\theta,n}, G_{\theta,n})$. We also
consider the probability distribution $P(\theta, d\xi)$ of the $N$-particle model

$$\xi := (\xi_{\theta,0}, \xi_{\theta,1}, \ldots, \xi_{\theta,T})$$

on the interval $[0, T]$, with mutation transitions $M_{\theta,n}$, and potential selection
functions $G_{\theta,n}$, with $n \leq T$. We fix a large time horizon $T$, and for any $0 \leq n \leq T$,
we set

$$\mu_n(d(\xi, \theta)) = \frac{1}{\mathcal{Z}_n} \left\{\prod_{0 \leq p < n} h_p(\xi, \theta)\right\} \lambda(d(\xi, \theta)) \qquad (27)$$

with some sequence of bounded nonnegative potential functions

$$h_n : (\xi, \theta) \in \left(\prod_{0 \leq p \leq T} E_p^N\right) \times S \mapsto h_n(\xi, \theta) \in (0, \infty)$$

the reference measure $\lambda$ given by

$$\lambda(d(\xi, \theta)) = \nu(d\theta)\, P(\theta, d\xi)$$

and some normalizing constants $\mathcal{Z}_n$. Firstly, we observe that these target
measures have the same form as in (26). Thus, they can be sampled using the
interacting Markov chain Monte Carlo methodology presented in section 2.2.1.
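Now, we examine the situation where $h_p$ is given by the empirical mean
value of the potential function $G_{\theta,p}$ w.r.t. the occupation measures $\eta_{\theta,p}^N$ of the
$N$-particle model $\xi_{\theta,p} = \left(\xi_{\theta,p}^i\right)_{1 \leq i \leq N}$ associated with the realization $\Theta = \theta$;
more formally, we have that

$$h_p(\xi, \theta) = \frac{1}{N} \sum_{1 \leq i \leq N} G_{\theta,p}\left(\xi_{\theta,p}^i\right) = \eta_{\theta,p}^N(G_{\theta,p})$$

Using the unbiased property of the particle free energy models presented in (13),
we clearly have

$$\int P(\theta, d\xi) \left\{\prod_{0 \leq p < n} h_p(\xi, \theta)\right\} = \mathbb{E}\left(\prod_{0 \leq p < n} \eta_{\theta,p}^N(G_{\theta,p})\right) = \prod_{0 \leq p < n} \eta_{\theta,p}(G_{\theta,p})$$

from which we conclude that the $\Theta$-marginal of $\mu_n$ is given by the following
equation

$$\left(\mu_n \circ \Theta^{-1}\right)(d\theta) = \frac{1}{\mathcal{Z}_n} \left\{\prod_{0 \leq p < n} \eta_{\theta,p}(G_{\theta,p})\right\} \nu(d\theta)$$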

We end this section with some comments on these distributions.

As the initiated reader may have noticed, the marginal analysis
derived above coincides with the one developed in section 2.2.4, dedicated to island
particle models.

We also mention that the measures $\mu_n$ introduced in (27) can be approximated
using the interacting Markov chain Monte Carlo methodology presented
in section 2.2.1, or the particle MCMC methods introduced in the article [5].

Last but not least, we observe that

$$\prod_{0 \leq p < n} \eta_{\theta,p}(G_{\theta,p}) = \mathbb{E}\left(\prod_{0 \leq p < n} G_{\theta,p}(X_{\theta,p})\right) = \mathcal{Z}_n(\theta)$$

where $X_{\theta,n}$ stands for the Markov chain with transitions $M_{\theta,n}$, and initial
distribution $\eta_{\theta,0}$. In the r.h.s. of the above displayed formulae, $\mathcal{Z}_n(\theta)$ stands for
the normalizing constant of the Feynman-Kac measures defined as in (1), by
replacing $(M_n, G_n)$ by $(M_{\theta,n}, G_{\theta,n})$. This shows that

$$\left(\mu_n \circ \Theta^{-1}\right)(d\theta) = \frac{1}{\mathcal{Z}_n}\, \mathcal{Z}_n(\theta)\, \nu(d\theta)$$

The goal of some stochastic optimization problems is to extract the parameter $\theta$
that minimizes some mean value functional of the form

$$\theta \mapsto \mathcal{Z}_n(\theta) = \mathbb{E}\left(\prod_{0 \leq p < n} G_{\theta,p}(X_{\theta,p})\right)$$

For convex functionals, we can use gradient type techniques using the backward
Feynman-Kac derivative interpretation models developed in section 2.4.1
(see also the three joint articles of the first author with A. Doucet, and S. S.

Singh [311 EH ESI)- 

When $\nu$ is the uniform measure over some compact set $S$, an alternative
approach is to estimate the measures (27) by some empirical measure



\<i<N \ \0<P<n / / 



and to select the sampled state 



^O'" ) l<j<N ' V^l'" ) l<j<N ' • • ■ ' l^"'" ) l<j<N ' ' " 
that maximizes the empirical objective functional

0<p<Ti 'i-<j<N 

2.2.6 Markov bridges and chains with fixed terminal value 

In many applications, it is important to sample paths of Markov chains with
prescribed fixed terminal conditions.

When the left end starting point is distributed w.r.t. a given regular probability
measure $\pi$, we can use the time reversal Feynman-Kac formula presented
by the first author and A. Doucet in [30]. More precisely, for time homogeneous
models $(G_n, M_n) = (G, M)$ in transition spaces, if we consider the
Metropolis-Hastings ratio

$$G(x_1, x_2) = \frac{\pi(dx_2)\, K(x_2, dx_1)}{\pi(dx_1)\, M(x_1, dx_2)}$$

then we find that

$$\mathbb{Q}_n = \mathrm{Law}_\pi^K\left((X_0, \ldots, X_n) \mid X_n = x_n\right)$$

where $\mathrm{Law}_\pi^K$ stands for the distribution of the Markov chain starting with an
initial condition $\pi$ and evolving according to some Markov transition $K$. The proofs
of these formulae are rather technical; we refer the reader to the article [30] and
to the monograph [25].

For initial and terminal fixed end-points, we need to consider the path
distributions of Markov bridges. As we mentioned in the introduction,
these Markov bridges are particular instances of the reference Markov chains of
the abstract Feynman-Kac model (1). Depending on the choice of the potential
functions in (1), these Markov bridge models can be associated with several
application domains, including filtering problems or rare event analysis of bridge
processes.

We assume that the elementary Markov transitions $M_n$ of the chain $X_n$
satisfy the regularity condition (7) for some density functions $H_n$ and some
reference measure $\lambda_n$. In this situation, the semigroup Markov transitions
$M_{p,n+1} = M_{p+1} M_{p+2} \cdots M_{n+1}$ are absolutely continuous with respect to the
measure $\lambda_{n+1}$, for any $0 \leq p \leq n$, and we have

$$M_{p,n+1}(x_p, dx_{n+1}) = H_{p,n+1}(x_p, x_{n+1})\, \lambda_{n+1}(dx_{n+1})$$

with the density function

$$H_{p,n+1}(x_p, x_{n+1}) = M_{p,n}\left(H_{n+1}(\cdot, x_{n+1})\right)(x_p)$$

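Thanks to these regularity conditions, we readily check that the path
distribution of the Markov bridge starting at $x_0$ and ending at $x_{n+1}$ at the final time
horizon $(n + 1)$ is given by

$$\mathbb{B}_{(0, x_0), (n+1, x_{n+1})}(d(x_1, \ldots, x_n)) := \mathbb{P}\left((X_1, \ldots, X_n) \in d(x_1, \ldots, x_n) \mid X_0 = x_0,\ X_{n+1} = x_{n+1}\right)$$
$$= \prod_{1 \leq p \leq n} \frac{M_p(x_{p-1}, dx_p)\, H_{p,n+1}(x_p, x_{n+1})}{M_p\left(H_{p,n+1}(\cdot, x_{n+1})\right)(x_{p-1})}$$

Using some abusive Bayesian notation, we can rewrite these formulae as follows

$$p((x_1, \ldots, x_n) \mid (x_0, x_{n+1})) = \frac{p(x_{n+1} \mid x_n)}{p(x_{n+1} \mid x_{n-1})}\, p(x_n \mid x_{n-1}) \cdots \frac{p(x_{n+1} \mid x_p)}{p(x_{n+1} \mid x_{p-1})}\, p(x_p \mid x_{p-1}) \cdots \frac{p(x_{n+1} \mid x_1)}{p(x_{n+1} \mid x_0)}\, p(x_1 \mid x_0)$$

with

$$\frac{dM_p(x_{p-1}, \cdot)}{d\lambda_p}(x_p) = p(x_p \mid x_{p-1}) \quad\text{and}\quad H_{p,n+1}(x_p, x_{n+1}) = p(x_{n+1} \mid x_p)$$

For linear-Gaussian models, the Markov bridge transitions

$$\frac{M_p(x_{p-1}, dx_p)\, H_{p,n+1}(x_p, x_{n+1})}{M_p\left(H_{p,n+1}(\cdot, x_{n+1})\right)(x_{p-1})} = \frac{p(x_{n+1} \mid x_p)}{p(x_{n+1} \mid x_{p-1})}\, p(x_p \mid x_{p-1})\, \lambda_p(dx_p)$$

can be explicitly computed using the traditional regression formula, or equivalently
the updating step of the Kalman filter.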
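
For the simplest linear-Gaussian example, the regression formula can be written down explicitly. The Python sketch below assumes, purely for illustration, a one-dimensional random walk $X_p = X_{p-1} + W_p$ with independent standard Gaussian increments. In that case $X_p$ given $X_{p-1} = u$ and $X_{n+1} = v$ is Gaussian with mean $u + (v - u)/m$ and variance $(m - 1)/m$, where $m = n + 2 - p$ is the number of remaining steps. The function names are hypothetical.

```python
import random


def bridge_step_params(x_prev, x_end, steps_left):
    """Mean and variance of X_p given X_{p-1} = x_prev and X_{n+1} = x_end.

    steps_left = n + 2 - p is the number of unit-variance increments between
    time p - 1 and the terminal time n + 1.
    """
    mean = x_prev + (x_end - x_prev) / steps_left
    variance = (steps_left - 1) / steps_left
    return mean, variance


def sample_bridge(x0, x_end, n, rng):
    """Samples (X_1, ..., X_n) of the bridge from X_0 = x0 to X_{n+1} = x_end."""
    path, x = [], x0
    for p in range(1, n + 1):
        mean, variance = bridge_step_params(x, x_end, n + 2 - p)
        x = rng.gauss(mean, variance ** 0.5)
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(3)
    print(sample_bridge(0.0, 4.0, 3, rng))
```

Sampling the bridge forward in time with these conditional laws avoids any rejection step. For general models the ratio in the display above has no closed form, which is where particle approximations come in.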

2.3 Rare event analysis 

2.3.1 Importance sampling and twisted measures 

Computing the probability of some events of the form $\{V_n(X_n) \geq a\}$, for some
energy like function $V_n$ and some threshold $a$, is often performed using the
importance sampling distribution of the state variable $X_n$ with some multiplicative
Boltzmann weight function $e^{\beta V_n(X_n)}$ associated with some temperature
parameter $\beta$. These twisted measures can be described by a Feynman-Kac model in
transition space by setting

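For instance, it is easily checked that

$$\mathbb{P}(V_n(X_n) \geq a) = \mathbb{E}\left(f_n(\mathcal{X}_n) \prod_{0 \leq p < n} G_p(\mathcal{X}_p)\right)$$

with

$$\mathcal{X}_n = (X_n, X_{n+1}) \quad\text{and}\quad G_n(\mathcal{X}_n) = e^{V_{n+1}(X_{n+1}) - V_n(X_n)}$$

and the test function

$$f_n(\mathcal{X}_n) = 1_{V_n(X_n) \geq a}\, e^{-V_n(X_n)}$$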

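In the same vein, we have

$$\mathbb{E}\left(\varphi_n(X_0, \ldots, X_n) \mid V_n(X_n) \geq a\right) = \mathbb{E}\left(F_{n, \varphi_n}(X_0, \ldots, X_n)\, e^{V_n(X_n)}\right) \Big/ \mathbb{E}\left(F_{n, 1}(X_0, \ldots, X_n)\, e^{V_n(X_n)}\right)$$

with the function

$$F_{n, \varphi_n}(x_0, \ldots, x_n) = \varphi_n(x_0, \ldots, x_n)\, 1_{V_n(x_n) \geq a}\, e^{-V_n(x_n)}$$

We illustrate these rather abstract formulae with a Feynman-Kac formulation of
European style call options with exercise price $a$ at time $n$. The prices of these
financial contracts are given by formulae of the following form

$$\mathbb{E}\left((V_n(X_n) - a)_+\right) = \mathbb{E}\left((V_n(X_n) - a)\, 1_{V_n(X_n) \geq a}\right) = \mathbb{P}(V_n(X_n) \geq a) \times \mathbb{E}\left((V_n(X_n) - a) \mid V_n(X_n) \geq a\right)$$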

It is now a simple exercise to check that these formulae fit with the Feynman-Kac 
importance sampling model discussed above. Further details on these models, 
including applications in fiber optics communication and financial risk analysis 
can also be found in the couple of articles [MJ |35] and in the article [B] . 

2.3.2 Rare event excursion models 

If we consider Markov excursion type models between a sequence of decreasing
subsets $A_n$ or hitting an absorbing level $B$, then choosing an indicator potential
function that detects if the $n$-th excursion hits $A_n$ before $B$ we find that

$$\mathbb{Q}_n = \mathrm{Law}\left((\mathcal{X}_0, \ldots, \mathcal{X}_n) \mid X \text{ hits } A_n \text{ before } B\right)$$

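and

$$\mathbb{P}(X \text{ hits } A_n \text{ before } B) = \mathbb{E}\left(\prod_{1 \leq p \leq n} 1_{A_p}(X_{T_p})\right) = \mathbb{E}\left(\prod_{0 \leq p < n} G_p(\mathcal{X}_p)\right)$$

with the random times

$$T_n := \inf \left\{p \geq T_{n-1} \; : \; X_p \in (A_n \cup B)\right\}$$

and the excursion models

$$\mathcal{X}_n = (X_p)_{p \in [T_n, T_{n+1}]} \quad \& \quad G_n(\mathcal{X}_n) = 1_{A_{n+1}}\left(X_{T_{n+1}}\right)$$

In this notation, it is also easily checked that

$$\mathbb{E}\left(\mathbf{f}_n\left(X_{[0, T_{n+1}]}\right) \mid X \text{ hits } A_n \text{ before } B\right) = \mathbb{Q}_n(\mathbf{f}_n)$$

For a more thorough discussion of these excursion particle models, we refer the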
reader to the series of articles [TTl [HI [HI US E2 • 

2.4 Sensitivity measures 

2.4.1 Kinetic sensitivity measures 

We let $\theta \in \mathbb{R}^d$ be some parameter that may represent some kinetic type
parameters related to the free evolution model or to the adaptive potential functions.
We assume that the free evolution model $X_k^{(\theta)}$, associated to some value of the
parameter $\theta$, is given by a one-step probability transition of the form

$$M_k^{(\theta)}(x,dx') := \mathbb{P}\big(X_k^{(\theta)} \in dx' \mid X_{k-1}^{(\theta)} = x\big) = H_k^{(\theta)}(x,x')\; \lambda_k(dx')$$

for some positive density functions $H_k^{(\theta)}$ and some reference measures $\lambda_k$. We
also consider a collection of functions $G_k^{(\theta)} = e^{-V_k^{(\theta)}}$ that depend on $\theta$. We also
assume that the gradient and the Hessian of the logarithms of these functions
w.r.t. the parameter $\theta$ are well defined. We let $\Gamma_n^{(\theta)}$ be the Feynman-Kac measure
associated with a given value of $\theta$ defined for any function $f_n$ on the path space
$E_n$ by

$$\Gamma_n^{(\theta)}(f_n) = \mathbb{E}\Big(f_n(X_0^{(\theta)},\ldots,X_n^{(\theta)}) \prod_{0 \le p < n} G_p^{(\theta)}(X_p^{(\theta)})\Big) \qquad (28)$$

We denote by $\Gamma_n^{(\theta,N)}$ the $N$-particle approximation measures associated with a
given value of the parameter $\theta$ and defined in p3)) .

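By using simple derivation calculations, we prove that the first and second order
derivatives of the option value w.r.t. $\theta$ are given by

$$\nabla \Gamma_n^{(\theta)}(f_n) = \Gamma_n^{(\theta)}\big(f_n\, \Lambda_n^{(\theta)}\big)$$

$$\nabla^2 \Gamma_n^{(\theta)}(f_n) = \Gamma_n^{(\theta)}\big[f_n\, (\nabla L_n^{(\theta)})'(\nabla L_n^{(\theta)}) + f_n\, \nabla^2 L_n^{(\theta)}\big]$$

with $\Lambda_n^{(\theta)} := \nabla L_n^{(\theta)}$ and the additive functional

$$L_n^{(\theta)}(x_0,\ldots,x_n) := \sum_{p=1}^{n} \log\big(G_{p-1}^{(\theta)}(x_{p-1})\; H_p^{(\theta)}(x_{p-1},x_p)\big)$$

These quantities are approximated by the unbiased particle models

$$\nabla_N \Gamma_n^{(\theta)}(f_n) := \Gamma_n^{(\theta,N)}\big(f_n\, \Lambda_n^{(\theta)}\big)$$

$$\nabla_N^2 \Gamma_n^{(\theta)}(f_n) = \Gamma_n^{(\theta,N)}\big[f_n\, (\nabla L_n^{(\theta)})'(\nabla L_n^{(\theta)}) + f_n\, \nabla^2 L_n^{(\theta)}\big]$$

For a more thorough discussion on these Backward Feynman-Kac models, we
refer the reader to the three joint articles of the first author with A. Doucet
and S. S. Singh [111133133].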

2.4.2 Gradient estimation of Markov semigroup 

We assume that the underlying stochastic evolution is given by an iterated
$\mathbb{R}^d$-valued random process given by the following equation

$$X_{n+1} := F_n(X_n) = (F_n \circ F_{n-1} \circ \cdots \circ F_0)(X_0) \qquad (29)$$

starting at some random state $X_0$, with a sequence of random smooth functions
of the form

$$F_n(x) = \mathcal{F}_n(x, W_n) \qquad (30)$$

with some smooth collection of functions

$$\mathcal{F}_n : (x,w) \in \mathbb{R}^{d+d'} \mapsto \mathcal{F}_n(x,w) \in \mathbb{R}^d$$

and some collection of independent, and independent of $X_0$, random variables $W_n$
taking values in some $\mathbb{R}^{d'}$, with $d' \ge 1$. The semigroup of the Markov chain $X_n$
is the expectation operator defined for any regular function $f_{n+1}$ and any state $x$
by

$$P_{n+1}(f_{n+1})(x) := \mathbb{E}(f_{n+1}(X_{n+1}) \mid X_0 = x) = \mathbb{E}(f_{n+1}(X_{n+1}(x)))$$

with the random flows $(X_n(x))_{n \ge 0}$ defined for any $n \ge 0$ by the following
equation

$$X_{n+1}(x) = F_n(X_n(x))$$

with the initial condition $X_0(x) = x$.

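By construction, for any $1 \le i,j \le d$ and any $x \in \mathbb{R}^d$ we have the first
variational equation

$$\frac{\partial X_{n+1}^i}{\partial x^j}(x) = \sum_{1 \le k \le d} \frac{\partial F_n^i}{\partial x^k}(X_n(x))\; \frac{\partial X_n^k}{\partial x^j}(x) \qquad (31)$$

This clearly implies that

$$\frac{\partial P_{n+1}(f_{n+1})}{\partial x^j}(x) = \mathbb{E}\Big(\sum_{1 \le i \le d} \frac{\partial f_{n+1}}{\partial x^i}(X_{n+1}(x))\; \frac{\partial X_{n+1}^i}{\partial x^j}(x)\Big) \qquad (32)$$

We denote by $V_n = (V_n^{(i,j)})_{1 \le i,j \le d}$ and $A_n = (A_n^{(i,j)})_{1 \le i,j \le d}$ the random
$(d \times d)$ matrices with the i-th line and j-th column entries

$$V_n^{(i,j)}(x) = \frac{\partial X_n^i}{\partial x^j}(x)$$

and

$$A_n^{(i,j)}(x) = \frac{\partial F_n^i}{\partial x^j}(x) = \frac{\partial \mathcal{F}_n^i(\cdot, W_n)}{\partial x^j}(x)$$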



In this notation, the equation (31) can be rewritten in terms of the following
random matrix formulae

$$V_{n+1}(x) = A_n(X_n(x))\; V_n(x) := \prod_{p=0}^{n} A_p(X_p(x)) \qquad (33)$$

with a product $\prod_{p=0}^{n} A_p$ of noncommutative random elements $A_p$ taken in the
order $A_n, A_{n-1}, \ldots, A_0$. In the same way, the equation (32) can be rewritten as

$$\nabla P_{n+1}(f_{n+1})(x) = \mathbb{E}(\nabla f_{n+1}(X_{n+1})\; V_{n+1} \mid X_0 = x) \qquad (34)$$

with

$$V_{n+1} := \prod_{0 \le p \le n} A_p(X_p)$$

We equip the space $\mathbb{R}^d$ with some norm $\|\cdot\|$. We assume that for any state
$U_0$ in the unit sphere $\mathcal{S}^{d-1}$, we have

$$\|V_{n+1}\, U_0\| > 0$$

In this situation, we have the multiplicative formulae

$$\nabla f_{n+1}(X_{n+1})\; V_{n+1}\, U_0 = [\nabla f_{n+1}(X_{n+1})\; U_{n+1}] \prod_{0 \le p \le n} \|A_p(X_p)\, U_p\|$$

with the well defined $\mathcal{S}^{d-1}$-valued Markov chain defined by

$$U_{n+1} = A_n(X_n)\, U_n / \|A_n(X_n)\, U_n\| \iff U_{n+1} = \frac{V_{n+1}\, U_0}{\|V_{n+1}\, U_0\|}$$



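If we choose $U_0 = u_0$, then we obtain the following Feynman-Kac interpretation
of the gradient of a semigroup

$$\nabla P_{n+1}(f_{n+1})(x)\; u_0 = \mathbb{E}\Big(F_{n+1}(\mathcal{X}_{n+1}) \prod_{0 \le p \le n} \mathcal{G}_p(\mathcal{X}_p)\Big) \qquad (35)$$

In the above display, $\mathcal{X}_n$ is the Markov chain sequence

$$\mathcal{X}_n := (X_n, U_n, W_n)$$

starting at $(x, u_0, W_0)$, and the functions $F_{n+1}$ and $\mathcal{G}_n$ are defined by

$$F_{n+1}(x,u,w) := \nabla f_{n+1}(x)\; u \quad\text{and}\quad \mathcal{G}_n(x,u,w) := \|A_n(x,w)\; u\|$$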

In the computational physics literature, the mean particle approximations of
these noncommutative Feynman-Kac models are often referred to as Resampled
Monte Carlo methods [92] .
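The multiplicative formula behind this Feynman-Kac interpretation says that the norm of the matrix product applied to $u_0$ equals the product of the one-step norms along the sphere-valued chain $U_p$. The sketch below checks this identity numerically. It assumes the Euclidean norm and uses hypothetical random 2x2 matrices with positive entries in place of the Jacobians $A_p(X_p)$; none of these choices come from the notes.

```python
import math
import random

def mat_vec(a, u):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ik * u_k for a_ik, u_k in zip(row, u)) for row in a]

def norm(u):
    """Euclidean norm (an illustrative choice of the norm ||.||)."""
    return math.sqrt(sum(x * x for x in u))

def projective_chain(mats, u0):
    """Run U_{p+1} = A_p U_p / ||A_p U_p|| and accumulate sum_p log ||A_p U_p||."""
    u, log_weight = list(u0), 0.0
    for a in mats:
        v = mat_vec(a, u)
        r = norm(v)
        log_weight += math.log(r)
        u = [x / r for x in v]
    return u, log_weight

rng = random.Random(1)
# Hypothetical 2x2 matrices with positive entries, standing in for A_p(X_p).
mats = [[[rng.uniform(0.5, 1.5) for _ in range(2)] for _ in range(2)]
        for _ in range(50)]
u0 = [1.0, 0.0]

# Direct product V u0 = A_49 ... A_0 u0, applying A_0 first.
v = list(u0)
for a in mats:
    v = mat_vec(a, v)

u_final, log_weight = projective_chain(mats, u0)
log_norm_direct = math.log(norm(v))
print(log_norm_direct, log_weight)
```

Working with the normalized chain and the accumulated log-weights avoids the overflow or underflow that the raw matrix product would eventually produce over long time horizons.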

Roughly speaking, besides the fact that formula (35) provides an explicit
functional Feynman-Kac description of the gradient of a Markov semigroup,
the random evolution model $U_n$ on the unit sphere may be degenerate. More
precisely, the Markov chain $\mathcal{X}_n = (X_n, U_n, W_n)$ may not satisfy the regularity
properties stated in section 3.4.1. We end this section with some rather crude
upper bound that can be estimated uniformly w.r.t. the time parameter under
appropriate regularity conditions on the reduced Markov chain model $(X_n, W_n)$.
Firstly, we notice that

$$\mathcal{G}_n(x,u,w) := \|A_n(x,w)\; u\| \le G_n(x,w) := \|A_n(x,w)\| := \sup_{u \in \mathcal{S}^{d-1}} \|A_n(x,w)\; u\|$$

This implies that

$$\|\nabla P_{n+1}(f_{n+1})(x)\| := \sup_{1 \le i \le d} \Big|\frac{\partial P_{n+1}(f_{n+1})}{\partial x^i}(x)\Big| \le \|F_{n+1}\| \times \mathbb{E}\Big(\prod_{0 \le p \le n} G_p(X_p, W_p)\Big)$$

The r.h.s. functional expectation in the above equation can be approximated
using the particle approximation (12) of the multiplicative Feynman-Kac formulae
(5), with reference Markov chain $(X_n, W_n)$ and potential functions $G_n$.
2.5 Partial observation models

2.5.1 Nonlinear filtering models 

In this section we introduce one of the most important examples of estimation
problems with partial observation, namely the nonlinear filtering model. This
model has been the starting point of the application of particle models to
engineering sciences, and more particularly to advanced signal processing.

The first rigorous study of the stochastic modeling, and of the theoretical
analysis of particle filters, was started in the mid 1990's in the
article [19] . For a detailed discussion on the application domains of particle
filtering, with a precise bibliography we refer the reader to any of the following 
references m [23, and ^\M\- 

The typical model is given by a reference Markov chain model $X_n$, and some
partial and noisy observation $Y_n$. The pair process $(X_n, Y_n)$ usually forms a
Markov chain on some product space $E^X \times E^Y$ with elementary transitions
given by

$$\mathbb{P}((X_n, Y_n) \in d(x,y) \mid (X_{n-1}, Y_{n-1})) = M_n(X_{n-1}, dx)\; g_n(x,y)\; \lambda_n(dy) \qquad (36)$$

for some positive likelihood function $g_n$, some reference probability measure
$\lambda_n$ on $E^Y$, and the elementary Markov transitions $M_n$ of the Markov chain $X_n$.
If we take

$$G_n(x_n) = p_n(y_n | x_n) = g_n(x_n, y_n) \qquad (37)$$

the likelihood function of a given observation $Y_n = y_n$ and a signal state $X_n = x_n$
associated with a filtering or a hidden Markov chain problem, then we find that

$$\mathbb{Q}_n = \mathrm{Law}((X_0, \ldots, X_n) \mid \forall\, 0 \le p < n \;\; Y_p = y_p)$$

and

$$Z_{n+1} = p_n(y_0, \ldots, y_n)$$

In this context, the optimal one step predictor $\eta_n$ and the optimal filter $\widehat{\eta}_n$ are
given by the n-th time marginal distribution

$$\eta_n^{[y_0,\ldots,y_{n-1}]} = \eta_n = \mathrm{Law}(X_n \mid \forall\, 0 \le p < n \;\; Y_p = y_p) \qquad (38)$$

and

$$\widehat{\eta}_n^{[y_0,\ldots,y_n]} = \widehat{\eta}_n = \Psi_{G_n}(\eta_n) = \mathrm{Law}(X_n \mid \forall\, 0 \le p \le n \;\; Y_p = y_p) \qquad (39)$$
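On a finite state space the predictor and filter recursion, together with the product formula for the normalizing constant, can be computed exactly. The sketch below is an illustration under assumed toy values (a hypothetical two-state chain, two observation symbols, and a short observation record); it compares the product of the one-step normalizations $\eta_p(G_p)$ with a brute-force sum over all signal paths.

```python
from itertools import product

def forward_filter(eta0, M, g, ys):
    """Predictor/filter recursion on a finite state space.

    eta0[i] : law of X_0 (the predictor at time 0)
    M[i][j] : signal transition probabilities
    g[i][y] : likelihood of observation y given signal state i
    Returns (Z, predictor) with Z = p(y_0, ..., y_n) = prod_p eta_p(G_p).
    """
    eta, Z = list(eta0), 1.0
    n_states = len(eta0)
    for y in ys:
        G = [g[i][y] for i in range(n_states)]
        norm = sum(e * w for e, w in zip(eta, G))           # eta_p(G_p)
        Z *= norm
        hat = [e * w / norm for e, w in zip(eta, G)]        # filter Psi_G(eta)
        eta = [sum(hat[i] * M[i][j] for i in range(n_states))
               for j in range(n_states)]                    # next predictor
    return Z, eta

def brute_force_likelihood(eta0, M, g, ys):
    """Sum the joint weight of the observations over every signal path."""
    total = 0.0
    for path in product(range(len(eta0)), repeat=len(ys)):
        w = eta0[path[0]] * g[path[0]][ys[0]]
        for p in range(1, len(ys)):
            w *= M[path[p - 1]][path[p]] * g[path[p]][ys[p]]
        total += w
    return total

# Hypothetical two-state model and observation record.
eta0 = [0.5, 0.5]
M = [[0.9, 0.1], [0.2, 0.8]]
g = [[0.7, 0.3], [0.1, 0.9]]
ys = [0, 1, 1, 0]
Z, predictor = forward_filter(eta0, M, g, ys)
Z_check = brute_force_likelihood(eta0, M, g, ys)
print(Z, Z_check, predictor)
```

A particle filter replaces the exact measures `eta` and `hat` by weighted empirical measures, with the same selection (weighting by G) and mutation (moving with M) structure.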

Remark 2.1 We can combine these filtering models with the probability restric- 
tion models discussed in section [2.2.2l or with the rare event analysis presented 
in section 2.3. For instance, if we replace the potential likelihood function $G_n$
defined in (37) by the function

$$G_n(x_n) = g_n(x_n, y_n)\; 1_{A_n}(x_n)$$

then we find that

$$\mathbb{Q}_n = \mathrm{Law}((X_0, \ldots, X_n) \mid \forall\, 0 \le p < n \;\; Y_p = y_p,\; X_p \in A_p)$$
2.5.2 Approximated filtering models

We return to the stochastic filtering model discussed in section 2.5.1. In some
instances, the likelihood functions $x_n \mapsto g_n(x_n, y_n)$ in (37) are computationally
intractable, or too expensive to evaluate.

To solve this problem, a natural solution is to sample pseudo-observations.
The central idea is to sample the signal-observation Markov chain

$$\mathcal{X}_n = (X_n, Y_n) \in E^{\mathcal{X}} = (E^X \times E^Y)$$

and compare the values of the sampled observations with the real observations.
To describe these models with some precision, we notice that the transitions
of $\mathcal{X}_n$ are given by

$$\mathcal{M}_n(\mathcal{X}_{n-1}, d(x,y)) = M_n(X_{n-1}, dx)\; g_n(x,y)\; \lambda_n(dy)$$

To simplify the presentation, we further assume that $E^Y = \mathbb{R}^d$, for some $d \ge 1$,
and we let $g$ be a Borel bounded non negative function such that

$$\int g(u)\, du = 1, \qquad \int u\, g(u)\, du = 0 \qquad\text{and}\qquad \int |u|\, g(u)\, du < \infty$$

Then, we set for any $\epsilon > 0$, and any $\mathbf{x} = (x, y) \in (E^X \times E^Y)$

$$g_{\epsilon,n}((x,y), z) = \epsilon^{-d}\; g((y - z)/\epsilon)$$

Finally, we let $(\mathcal{X}_n, Y_n^{\epsilon})$ be the Markov chain on the augmented state space
$(E^{\mathcal{X}} \times E^Y) = ((E^X \times E^Y) \times E^Y)$ with transitions given by

$$\mathbb{P}((\mathcal{X}_n, Y_n^{\epsilon}) \in d(\mathbf{x}, y) \mid (\mathcal{X}_{n-1}, Y_{n-1}^{\epsilon})) = \mathcal{M}_n(\mathcal{X}_{n-1}, d\mathbf{x})\; g_{\epsilon,n}(\mathbf{x}, y)\; dy \qquad (40)$$

This approximated filtering problem has exactly the same form as the one
introduced in (36). Here, the particle approximation models are defined in terms of
signal-observation valued particles, and the selection potential function is given
by the pseudo-likelihood functions $g_{\epsilon,n}(\cdot, y_n)$, where $y_n$ stands for the value of
the observation sequence at time n.

For a more detailed discussion on these particle models, including the con- 
vergence analysis of the approximated filtering model, we refer the reader to the 
article pTl l¥T] . These particle models are sometimes called convolution parti- 
cle filters [53]. In Bayesian literature, these approximated filtering models are 
termed as Approximate Bayesian Computation (and often abbreviated with the 
acronym ABC). 
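The pseudo-likelihood construction can be sketched in one dimension. The example below assumes, purely for illustration, that $g$ is the standard Gaussian density (which is bounded, integrates to one and has zero mean) and that d = 1; the function names and the numerical values are hypothetical. It checks that the rescaled kernel still integrates to one and shows how particles whose simulated observation is close to the real one receive larger selection weights.

```python
import math

def gaussian_kernel(u):
    """Standard normal density: bounded, integrates to 1, zero mean (d = 1)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def pseudo_likelihood(y_sim, y_obs, eps):
    """g_eps((x, y), z) = eps^{-1} g((y - z) / eps) in dimension d = 1."""
    return gaussian_kernel((y_sim - y_obs) / eps) / eps

def selection_weights(sim_obs, y_obs, eps):
    """Normalized selection weights of particles carrying pseudo-observations."""
    w = [pseudo_likelihood(y, y_obs, eps) for y in sim_obs]
    total = sum(w)
    return [x / total for x in w]

eps = 0.1
step = 1e-3
# Midpoint Riemann sum of z -> g_eps(y, z) over [y - 1, y + 1], with y = 0.5.
mass = sum(pseudo_likelihood(0.5, -0.5 + (k + 0.5) * step, eps) * step
           for k in range(2000))

# Three particles with simulated observations 0.0, 0.5, 1.0; real observation 0.5.
weights = selection_weights([0.0, 0.5, 1.0], y_obs=0.5, eps=0.25)
print(mass, weights)
```

Smaller values of eps make the selection sharper and the approximation of the true filter closer, at the price of more degenerate weights for a fixed number of particles.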

2.5.3 Parameter estimation in hidden Markov chain models 

We consider a pair signal-observation filtering model $(X, Y)$ that depends on
some random variable $\Theta$ with distribution $\mu$ on some state space $S$. Arguing as
above, if we take

$$G_{\theta,n}(x_n) = p_n(y_n | x_n, \theta)$$

the likelihood function of a given observation $Y_n = y_n$ and a signal state $X_n = x_n$
and a realization of the parameter $\Theta = \theta$, then the n-th time marginal of $\mathbb{Q}_n$ is
given by

$$\eta_{\theta,n} = \mathrm{Law}(X_n \mid \forall\, 0 \le p < n \;\; Y_p = y_p,\; \theta)$$

Using the multiplicative formula (5), we prove that

$$Z_{n+1}(\theta) = p_n(y_0, \ldots, y_n | \theta) = \prod_{0 \le p \le n} \eta_{\theta,p}(G_{\theta,p})$$

with

$$\eta_{\theta,p}(G_{\theta,p}) = p(y_p | y_0, \ldots, y_{p-1}, \theta) = \int p(y_p | x_p, \theta)\; dp(x_p | \theta, y_0, \ldots, y_{p-1}) = \int G_{\theta,p}(x_p)\; \eta_{\theta,p}(dx_p)$$

from which we conclude that

$$\mathbb{P}(\Theta \in d\theta \mid \forall\, 0 \le p < n \;\; Y_p = y_p) = \frac{1}{Z_n}\; Z_n(\theta)\; \mu(d\theta)$$

with

$$Z_n := \int Z_n(\theta)\; \mu(d\theta)$$



In some instances, such as in conditionally linear Gaussian models, the
normalizing constants $Z_n(\theta)$ can be computed explicitly, and we can use a
Metropolis-Hastings style Markov chain Monte Carlo method to sample the target measures
$\mu_n$. As in section 2.2.1, we can also turn this scheme into an interacting Markov
chain Monte Carlo algorithm.

Indeed, let us choose Markov chain Monte Carlo type local moves $\mu_n = \mu_n M_n$
with prescribed target measures

$$\mu_n(d\theta) := \frac{1}{Z_n}\; Z_n(\theta)\; \mu(d\theta)$$

Notice that

$$Z_{n+1}(\theta) = Z_n(\theta) \times \eta_{\theta,n}(G_{\theta,n}) \implies \mu_{n+1} = \Psi_{G_n}(\mu_n)$$

with the Boltzmann-Gibbs transformations $\Psi_{G_n}$ associated with the
potential function

$$G_n(\theta) = \eta_{\theta,n}(G_{\theta,n})$$

By construction, we have

$$\mu_{n+1} = \mu_{n+1} M_{n+1} = \Psi_{G_n}(\mu_n)\, M_{n+1}$$

from which we conclude that

$$\mu_n(f) = \mathbb{E}\Big(f(\Theta_n) \prod_{0 \le p < n} G_p(\Theta_p)\Big) \Big/ \mathbb{E}\Big(\prod_{0 \le p < n} G_p(\Theta_p)\Big)$$

with the reference Markov chain

$$\mathbb{P}(\Theta_n \in d\theta \mid \Theta_{n-1}) = M_n(\Theta_{n-1}, d\theta)$$

In addition, we have

$$Z_{n+1} = \int Z_n(\theta)\; G_n(\theta)\; \mu(d\theta) = Z_n\; \mu_n(G_n) = \prod_{0 \le p \le n} \mu_p(G_p)$$

Remark 2.2 For more general models, we can use the particle Markov chain
Monte Carlo methodology presented in section 2.2.5. When the likelihood
functions are too expensive to evaluate, we can also combine these particle models
with the pseudo-likelihood stochastic models (40) discussed in section 2.5.2.

2.5.4 Interacting Kalman-Bucy filters 

We use the same notation as above, but we assume that $\Theta = (\Theta_n)_{n \ge 0}$ is a
random sample of a stochastic process $\Theta_n$ taking values in some state spaces
$S_n$. If we consider the Feynman-Kac model associated with the Markov chain
$\mathcal{X}_n = (\Theta_n, \eta_{\Theta,n})$ and the potential functions

$$G_n(\mathcal{X}_n) = \eta_{\Theta,n}(G_{\Theta,n})$$

then we find that

$$\mathbb{Q}_n = \mathrm{Law}(\Theta_0, \ldots, \Theta_n \mid \forall\, 0 \le p < n \;\; Y_p = y_p)$$

and the n-th time marginals are clearly given by

$$\eta_n = \mathrm{Law}(\Theta_n \mid \forall\, 0 \le p < n \;\; Y_p = y_p)$$

Assuming that the pair $(X, Y)$ is a linear and Gaussian filtering model given
$\Theta$, the measures $\eta_{\Theta,n}$ coincide with the one step predictor of the Kalman-Bucy
filter and the potential functions $G_n(\mathcal{X}_n)$ can be easily computed by Gaussian
integral calculations. In this situation, the conditional distribution of the
parameter $\Theta$ is given by a Feynman-Kac model $\mathbb{Q}_n$ of the free Markov chain $\mathcal{X}_n$
weighted by some Boltzmann-Gibbs exponential weight function

$$\prod_{0 \le p \le n} G_p(\mathcal{X}_p) = p_n(y_0, \ldots, y_n \mid \Theta_0, \ldots, \Theta_n)$$

that reflects the likelihood of the path sequence $(\Theta_0, \ldots, \Theta_n)$. For a more
thorough discussion on these interacting Kalman filters, we refer the reader to
section 2.6 and section 12.6 in the monograph [25] .

2.5.5 Multi-target tracking models 

Multiple-target tracking problems deal with correctly estimating several
manoeuvring and interacting targets simultaneously given a sequence of noisy and
partial observations. At every time n, the first moment of the occupation
measure $\mathcal{X}_n := \sum_{i=1}^{N_n} \delta_{X_n^i}$ of some spatial branching signal is given for any regular
function $f$ by the following formula:

$$\gamma_n(f) := \mathbb{E}(\mathcal{X}_n(f)) \quad\text{with}\quad \mathcal{X}_n(f) := \int f(x)\; \mathcal{X}_n(dx)$$

For null spontaneous birth measures, these measures coincide with that of an
unnormalized Feynman-Kac model with some spatial branching potential
functions $G_n$ and some free evolution target model $X_n$.

In more general situations, the approximate filtering equation is given by
Mahler's multi-objective filtering approximation based on the propagation
of the first conditional moments of Poisson approximation models [Sni[7D]. These
evolution equations are rather complex to introduce and notationally consuming.
Nevertheless, as the first moment evolution of any spatial and marked branching
process, they can be abstracted by an unnormalized Feynman-Kac model with
nonlinear potential functions [3 |H1 H] .

2.5.6 Optimal stopping problems with partial observations 

We consider the partially observed Markov chain model discussed in (36). The
Snell envelope associated with an optimal stopping problem with finite horizon,
payoff style function $f_n(X_n, Y_n)$, and noisy observations $Y_n$ as some Markov
process, is given by

$$U_k := \sup_{\tau \in \mathcal{T}_k^Y} \mathbb{E}(f_\tau(X_\tau, Y_\tau) \mid (Y_0, \ldots, Y_k))$$

where $\mathcal{T}_k^Y$ stands for the set of all stopping times $\tau$ taking values in $\{k, \ldots, n\}$,
whose values are measurable w.r.t. the sigma field generated by the observation
sequence $Y_p$, from $p = 0$ up to the current time $k$. We denote by $\eta_n^{[y_0,\ldots,y_{n-1}]}$
and $\widehat{\eta}_n^{[y_0,\ldots,y_n]}$ the conditional distributions defined in (38) and (39). In this
notation, for any $0 \le k \le n$ we have that

$$\mathbb{E}(f_\tau(X_\tau, Y_\tau) \mid (Y_0, \ldots, Y_k)) = \mathbb{E}\big(F_\tau(Y_\tau, \widehat{\eta}_\tau^{[Y_0,\ldots,Y_\tau]}) \mid (Y_0, \ldots, Y_k)\big) \qquad (41)$$

with the conditional payoff function

$$F_p\big(Y_p, \widehat{\eta}_p^{[Y_0,\ldots,Y_p]}\big) = \int \widehat{\eta}_p^{[Y_0,\ldots,Y_p]}(dx)\; f_p(x, Y_p)$$

It is rather well known that

$$\mathcal{X}_p := \big(Y_p, \widehat{\eta}_p^{[Y_0,\ldots,Y_p]}\big)$$

is a Markov chain with elementary transitions defined by

$$\mathbb{E}\big(F_p(\mathcal{X}_p) \mid \mathcal{X}_{p-1} = (y, \mu)\big) = \int \lambda_p(dy_p)\; \mu M_p(g_p(\cdot, y_p))\; F_p\big(y_p, \Psi_{g_p(\cdot, y_p)}(\mu M_p)\big)$$

A detailed proof of this assertion can be found in any textbook on advanced
stochastic filtering theory. For instance, the book of W. Runggaldier and
L. Stettner [89] provides a detailed treatment of discrete time nonlinear filtering, and
related partially observed control models.
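Roughly speaking, using some abusive Bayesian notation, we have

$$\eta_p^{[y_0,\ldots,y_{p-1}]}(dx_p) = dp_p(x_p \mid (y_0, \ldots, y_{p-1})) = \int dp_p(x_p \mid x_{p-1}) \times dp_{p-1}(x_{p-1} \mid (y_0, \ldots, y_{p-1})) = \widehat{\eta}_{p-1}^{[y_0,\ldots,y_{p-1}]} M_p(dx_p)$$

and

$$\Psi_{g_p(\cdot, y_p)}\big(\eta_p^{[y_0,\ldots,y_{p-1}]}\big)(dx_p) = \frac{p_p(y_p \mid x_p)}{\int p_p(y_p \mid x'_p)\; dp_p(x'_p \mid (y_0, \ldots, y_{p-1}))}\; dp_p(x_p \mid (y_0, \ldots, y_{p-1})) = dp_p(x_p \mid (y_0, \ldots, y_{p-1}, y_p))$$

from which we prove that

$$\mu M_p(g_p(\cdot, y_p)) = \int p_p(y_p \mid x_p)\; dp_p(x_p \mid (y_0, \ldots, y_{p-1})) = p_p(y_p \mid (y_0, \ldots, y_{p-1}))$$

and

$$\Psi_{g_p(\cdot, y_p)}(\mu M_p) = \widehat{\eta}_p^{[y_0,\ldots,y_p]}$$

as soon as $\mu = \widehat{\eta}_{p-1}^{[y_0,\ldots,y_{p-1}]}$ ($\Rightarrow \mu M_p = \eta_p^{[y_0,\ldots,y_{p-1}]}$).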

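From the above discussion, we can rewrite (41) as the Snell envelope of a fully
observed augmented Markov chain sequence

$$\mathbb{E}(f_\tau(X_\tau, Y_\tau) \mid (Y_0, \ldots, Y_k)) = \mathbb{E}(F_\tau(\mathcal{X}_\tau) \mid (\mathcal{X}_0, \ldots, \mathcal{X}_k))$$

The Markov chain $\mathcal{X}_n$ takes values in an infinite dimensional state space,
and it can rarely be sampled without some additional level of approximation.
Using the $N$-particle approximation models, we can replace the chain $\mathcal{X}_n$ by the
$N$-particle approximation model defined by

$$\mathcal{X}_n^N := \big(Y_n, \widehat{\eta}_n^{([Y_0,\ldots,Y_n],N)}\big)$$

where

$$\widehat{\eta}_n^{([Y_0,\ldots,Y_n],N)} := \Psi_{g_n(\cdot, Y_n)}\big(\eta_n^{([Y_0,\ldots,Y_{n-1}],N)}\big)$$

stands for the updated measure associated with the likelihood selection
functions $g_n(\cdot, Y_n)$. The $N$-particle approximation of the Snell envelope is
now given by

$$\mathbb{E}(f_\tau(X_\tau, Y_\tau) \mid (Y_0, \ldots, Y_k)) \simeq_{N \uparrow \infty} \mathbb{E}(F_\tau(\mathcal{X}_\tau^N) \mid (\mathcal{X}_0^N, \ldots, \mathcal{X}_k^N))$$

In this interpretation, the $N$-approximated optimal stopping problem amounts
to computing the quantities

$$U_k^N := \sup_{\tau \in \mathcal{T}_k^N} \mathbb{E}(F_\tau(\mathcal{X}_\tau^N) \mid (\mathcal{X}_0^N, \ldots, \mathcal{X}_k^N))$$

where $\mathcal{T}_k^N$ stands for the set of all stopping times $\tau$ taking values in $\{k, \ldots, n\}$,
whose values are measurable w.r.t. the sigma field generated by the Markov
chain sequence $\mathcal{X}_p^N$, from $p = 0$ up to time $k$.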
2.6 Markov chain restriction models

2.6.1 Markov confinement models 

One of the simplest examples of Feynman-Kac conditional distributions is given
by choosing indicator functions $G_n = 1_{A_n}$ of measurable subsets $A_n \in \mathcal{E}_n$ s.t.
$\mathbb{P}(\forall\, 0 \le p < n \;\; X_p \in A_p) > 0$. In this situation, it is readily checked that

$$\mathbb{Q}_n = \mathrm{Law}((X_0, \ldots, X_n) \mid \forall\, 0 \le p < n \;\; X_p \in A_p)$$

and

$$Z_n = \mathbb{P}(\forall\, 0 \le p < n \;\; X_p \in A_p)$$

This Markov chain restriction model fits into the particle absorption model (3)
presented in the introduction. For a detailed analysis of these stochastic models,
and their particle approximations, we refer the reader to the articles [^E51E5]
and the monograph [25] .
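The confinement constants can be computed exactly in a toy case by alternating the selection and mutation steps on the exact distributions. The sketch below assumes, for illustration only, a simple symmetric random walk on the integers started at 0 and the time-independent set A = {-1, 0, 1}; the function name is hypothetical. The running product of the masses $\eta_p(1_A)$ gives $Z_1, Z_2, \ldots$ in turn.

```python
from fractions import Fraction

def confinement_constants(n, inside):
    """Return [Z_1, ..., Z_n] with Z_k = P(X_0, ..., X_{k-1} all satisfy `inside`).

    Uses Z_k = prod_{p < k} eta_p(1_A), where eta_p is the law of X_p given
    that X_0, ..., X_{p-1} stayed inside A (simple symmetric walk from 0).
    """
    eta = {0: Fraction(1)}                  # law of X_0
    Z, out = Fraction(1), []
    for _ in range(n):
        mass = sum(w for x, w in eta.items() if inside(x))   # eta_p(1_A)
        Z *= mass
        out.append(Z)
        # Selection: condition on A. Mutation: one symmetric +/-1 step.
        new = {}
        for x, w in eta.items():
            if inside(x):
                for step in (-1, 1):
                    new[x + step] = new.get(x + step, Fraction(0)) + w / mass / 2
        eta = new
    return out

constants = confinement_constants(5, inside=lambda x: abs(x) <= 1)
print(constants)
```

In this example the walk is forced back to 0 at every even time, which halves the survival probability every two steps.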

2.6.2 Directed polymers and self-avoiding walks 

The conformation of polymers in a chemical solvent can be seen as the
realization of a Feynman-Kac distribution of a free Markov chain weighted by some
Boltzmann-Gibbs exponential weight function that reflects the attraction or the
repulsion forces between the monomers. For instance, if we consider the
historical process

$$\mathcal{X}_n = (X_0, \ldots, X_n)$$

and

$$G_n(\mathcal{X}_n) = 1_{\not\in \{X_p,\; p < n\}}(X_n)$$

then we find that

$$\mathbb{Q}_n = \mathrm{Law}(\mathcal{X}_n \mid \forall\, 0 \le p < n \;\; \mathcal{X}_p \in A_p) = \mathrm{Law}((X_0, \ldots, X_n) \mid \forall\, 0 \le p < q < n \;\; X_p \ne X_q)$$

with the set $A_n = \{G_n = 1\}$, and the normalizing constants

$$Z_n = \mathbb{P}(\forall\, 0 \le p < q < n \;\; X_p \ne X_q)$$
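For the simple random walk on the square lattice, the normalizing constant of the self-avoiding model is the number of self-avoiding paths divided by the total number of paths. The sketch below assumes this specific free chain (nearest-neighbour walk on $\mathbb{Z}^2$ started at the origin, which the notes do not single out) and counts self-avoiding walks by exhaustive depth-first enumeration; the function names are hypothetical.

```python
from fractions import Fraction

def count_self_avoiding_walks(steps):
    """Number of self-avoiding nearest-neighbour walks on Z^2 with `steps` steps."""
    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        x, y = pos
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total
    return extend((0, 0), {(0, 0)}, steps)

def self_avoiding_probability(steps):
    """P(X_0, ..., X_steps are pairwise distinct) for the simple walk on Z^2."""
    return Fraction(count_self_avoiding_walks(steps), 4 ** steps)

counts = [count_self_avoiding_walks(k) for k in range(1, 6)]
prob_4 = self_avoiding_probability(4)
print(counts, prob_4)
```

Exhaustive enumeration grows exponentially with the length of the walk, which is why particle (or pruned-enriched) sampling schemes are used for long polymers.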

2.7 Particle absorption models 

We return to the particle absorption model (3) presented in the introduction.
For instance, we can assume that the potential function $G_n$ and Markov
transitions $M_n$ are defined by $G_n(x) = e^{-V_n(x) h}$, and

$$M_n(x, dy) = (1 - \lambda_n h)\; \delta_x(dy) + \lambda_n h\; K_n(x, dy) \qquad (42)$$

for some non negative and bounded function $V_n$, some positive parameter
$\lambda_n < 1/h$, $h > 0$, and some Markov transition $K_n$.

We also mention that the confinement models described above can also be
interpreted as a particle absorption model related to hard obstacles. In the
branching processes and population dynamics literature, the model $X_n^c$ often represents
the number of individuals of a given species [551 EZl IH3- . Each individual can
die or reproduce. The state $0 \in E_n = \mathbb{N}$ is interpreted as a trap, or as a hard
obstacle, in the sense that the species disappears as soon as $X_n^c$ hits 0. For a
more thorough discussion on particle motions in absorbing medium with hard
and soft obstacles, we refer the reader to the pair of articles [241 [25] .
2.7.1 Doob h-processes

We consider a time homogeneous Feynman-Kac model $(G_n, M_n) = (G, M)$ on
some measurable state space $E$, and we set

$$Q(x, dy) = G(x)\; M(x, dy)$$

We also assume that $G$ is uniformly bounded above and below by some positive
constant, and the Markov transition $M$ is reversible w.r.t. some probability
measure $\mu$ on $E$, with $M(x, \cdot) \simeq \mu$ and $dM(x, \cdot)/d\mu \in \mathbb{L}_2(\mu)$. We denote by $\lambda$
the largest eigenvalue of the integral operator $Q$ on $\mathbb{L}_2(\mu)$, and by $h(x)$ a positive
eigenvector

$$Q(h) = \lambda h$$

The Doob h-process corresponding to the ground state eigenfunction $h$ defined
above is a Markov chain $X_n^h$ with the time homogeneous Markov transition

$$M^h(x, dy) := \frac{1}{\lambda} \times h^{-1}(x)\; Q(x, dy)\; h(y) = \frac{M(x, dy)\; h(y)}{M(h)(x)}$$

and initial distribution $\eta_0^h(dx) \propto h(x)\; \eta_0(dx)$. By construction, we have
$G = \lambda h / M(h)$ and therefore

$$\Gamma_n(d(x_0, \ldots, x_n)) = \lambda^n\; \eta_0(h)\; \mathbb{P}_n^h(d(x_0, \ldots, x_n))\; \frac{1}{h(x_n)}$$

where $\mathbb{P}_n^h$ stands for the law of the historical process

$$\mathcal{X}_n^h = (X_0^h, \ldots, X_n^h)$$

We conclude that

$$\mathbb{Q}_n(d(x_0, \ldots, x_n)) = \frac{1}{\mathbb{E}(h^{-1}(X_n^h))}\; h^{-1}(x_n)\; \mathbb{P}_n^h(d(x_0, \ldots, x_n))$$

with the normalizing constants

$$Z_n = \lambda^n\; \eta_0(h)\; \mathbb{E}(h^{-1}(X_n^h))$$
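On a finite state space the ground state and the h-process transition can be computed directly. The sketch below assumes a hypothetical three-state symmetric transition matrix (hence reversible w.r.t. the uniform law) and an arbitrary positive potential; the eigenpair is approximated by plain power iteration, which is one possible numerical choice and not a method taken from the notes. It checks that $M^h$ is a Markov transition and that the two expressions of $M^h$ coincide.

```python
def mat_apply(M, f):
    """(M f)(x) = sum_y M[x][y] f[y]."""
    return [sum(m * v for m, v in zip(row, f)) for row in M]

def ground_state(G, M, iters=500):
    """Power iteration for Q(h) = lambda h, with Q(x, y) = G[x] M[x][y]."""
    h = [1.0] * len(G)
    lam = 1.0
    for _ in range(iters):
        qh = [g * v for g, v in zip(G, mat_apply(M, h))]
        lam = sum(qh) / sum(h)          # eigenvalue estimate
        total = sum(qh)
        h = [v / total for v in qh]     # keep h normalized to sum 1
    return lam, h

def doob_transition(G, M, lam, h):
    """M^h(x, y) = Q(x, y) h(y) / (lambda h(x))."""
    n = len(h)
    return [[G[x] * M[x][y] * h[y] / (lam * h[x]) for y in range(n)]
            for x in range(n)]

# Hypothetical symmetric transition matrix and positive potential.
M = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
G = [0.9, 0.6, 0.3]

lam, h = ground_state(G, M)
Mh = doob_transition(G, M, lam, h)
M_of_h = mat_apply(M, h)
Mh_alt = [[M[x][y] * h[y] / M_of_h[x] for y in range(3)] for x in range(3)]
row_sums = [sum(row) for row in Mh]
print(lam, h, row_sums)
```

The agreement between `Mh` and `Mh_alt` is the finite-state version of the identity $G = \lambda h / M(h)$.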

2.7.2 Yaglom limits and quasi-invariant measures 

We return to the time homogeneous Feynman-Kac models introduced in
section 2.7.1. Using the particle absorption interpretation (3), we have

Law((Xo^ ...,X^„)\T^>n)^ Eih-^iXf^)) ^"' ^^" ^ "*" 

and 

Zn = P (r'= > n) = A" m{h) Eih-\X':)) -^,,^^ (43) 

Letting $\eta_n^h := \mathrm{Law}(X_n^h)$, we readily prove the following formulae

$$\eta_n = \Psi_{1/h}(\eta_n^h) \quad\text{and}\quad \eta_n^h = \Psi_h(\eta_n)$$

Whenever it exists, the Yaglom limit of the measure $\eta_0$ is defined as the
limiting measure

$$\eta_n \longrightarrow_{n \uparrow \infty} \eta_\infty = \Psi_G(\eta_\infty)\, M \qquad (44)$$

of the Feynman-Kac flow $\eta_n$, when n tends to infinity. We also say that $\eta_0$
is a quasi-invariant measure if we have $\eta_0 = \eta_n$, for any time step. When the
Feynman-Kac flow $\eta_n$ is asymptotically stable, in the sense that it forgets its
initial conditions, we also say that the quasi-invariant measure $\eta_\infty$ is the Yaglom
measure. Whenever it exists, we let $\eta_\infty^h$ be the invariant measure of the h-process
$X_n^h$. Under our assumptions, it is now a simple exercise to check that

$$\eta_\infty = \Psi_{M(h)}(\mu) \quad\text{and}\quad \eta_\infty^h := \Psi_h(\eta_\infty) = \Psi_{h M(h)}(\mu)$$

Quantitative convergence estimates of the limiting formulae (43) and (44) can be
derived using the stability properties of the Feynman-Kac models developed in
chapter 3. For a more thorough discussion on these particle absorption models,
we refer the reader to the articles of the first author with A. Guionnet |38l 
[55] , the ones with L. Miclo [H [H], the one with A. Doucet [M], and the 
monographs [23 [2S]- 
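The fixed point equation (44) and the closed form $\eta_\infty = \Psi_{M(h)}(\mu)$ can be compared numerically on a finite state space. The sketch below reuses a hypothetical three-state symmetric matrix (reversible w.r.t. the uniform law $\mu$) and an arbitrary potential; it iterates the Feynman-Kac flow from a Dirac mass and compares the limit with the measure proportional to $M(h)\,\mu$, where $h$ is obtained by power iteration. All numerical values are illustrative assumptions.

```python
def normalize(v):
    """Rescale a non negative vector so that it sums to one."""
    s = sum(v)
    return [x / s for x in v]

def feynman_kac_step(eta, G, M):
    """One step eta -> Psi_G(eta) M of the normalized flow."""
    selected = normalize([e * g for e, g in zip(eta, G)])
    n = len(eta)
    return [sum(selected[x] * M[x][y] for x in range(n)) for y in range(n)]

def ground_state(G, M, iters=500):
    """Power iteration for the positive right eigenvector h of Q = G M."""
    n = len(G)
    h = [1.0] * n
    for _ in range(iters):
        h = normalize([G[x] * sum(M[x][y] * h[y] for y in range(n))
                       for x in range(n)])
    return h

M = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]          # symmetric: reversible w.r.t. the uniform law
G = [0.9, 0.6, 0.3]
mu = [1.0 / 3.0] * 3

eta = [1.0, 0.0, 0.0]             # start from a Dirac mass
for _ in range(500):
    eta = feynman_kac_step(eta, G, M)

h = ground_state(G, M)
M_of_h = [sum(M[x][y] * h[y] for y in range(3)) for x in range(3)]
predicted = normalize([mu[x] * M_of_h[x] for x in range(3)])   # Psi_{M(h)}(mu)
residual = max(abs(a - b) for a, b in zip(eta, feynman_kac_step(eta, G, M)))
gap = max(abs(a - b) for a, b in zip(eta, predicted))
print(eta, predicted)
```

Starting the iteration from another initial law gives the same limit here, which is the forgetting property that the stability analysis of chapter 3 quantifies.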

3 Feynman-Kac semigroup analysis 

3.1 Introduction 

As we mentioned in section 1.4, the concentration analysis of particle models
is intimately related to the regularity properties of the limiting nonlinear
semigroup. In this short section, we survey some selected topics on the theory of
Feynman-Kac semigroups developed in the series of articles |39l[^[5T] . For more
recent treatments, we also refer the reader to the books ^51 1261 .

We begin this chapter with a discussion on path space models. Section 3.2 is
concerned with Feynman-Kac historical processes and backward Markov chain
interpretation models. We show that the n-th marginal measures $\eta_n$ of a
Feynman-Kac model with a reference historical Markov process coincide with
the path space measures $\mathbb{Q}_n$ introduced in (1).

The second part of this section is dedicated to the proof of the backward
Markov chain formulae. In section 3.3 we analyze the regularity and the
semigroup structure of the normalized and unnormalized Feynman-Kac
distribution flows $\eta_n$ and $\gamma_n$.

Section 3.4 is concerned with the stability properties of the normalized
Feynman-Kac distribution flow. In a first section, section 3.4.1, we present
regularity conditions on the potential functions $G_n$ and on the Markov
transitions $M_n$, under which the Feynman-Kac semigroup forgets exponentially fast
its initial condition. Quantitative contraction theorems are provided in
section 3.4.2.

We illustrate these results with three applications related respectively to time 
discretization techniques, simulated annealing type schemes, and path space 
models. 

The last two sections of this chapter, section 3.5 and section 3.6, are
concerned with mean field stochastic particle models and local sampling random
field models.
3.2 Historical and backward models

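The historical process associated with some reference Markov chain $X_n$ is
defined by the sequence of random paths

$$\mathcal{X}_n = (X_0, \ldots, X_n) \in \mathbf{E}_n := (E_0 \times \cdots \times E_n)$$

Notice that the Markov transitions of the chain $\mathcal{X}_n$ are given for any
$\mathbf{x}_{n-1} = (x_0, \ldots, x_{n-1})$ and $\mathbf{y}_n = (y_0, \ldots, y_n) = (\mathbf{y}_{n-1}, y_n)$ by the following formula

$$\mathcal{M}_n(\mathbf{x}_{n-1}, d\mathbf{y}_n) = \delta_{\mathbf{x}_{n-1}}(d\mathbf{y}_{n-1})\; M_n(y_{n-1}, dy_n) \qquad (45)$$

We consider a sequence of (0, 1]-valued potential functions $\mathcal{G}_n$ on $\mathbf{E}_n$ whose
values only depend on the final state of the paths; that is, we have that

$$\mathcal{G}_n : \mathbf{x}_n = (x_0, \ldots, x_n) \in \mathbf{E}_n \mapsto \mathcal{G}_n(\mathbf{x}_n) := G_n(x_n) \in (0, 1] \qquad (46)$$

with some (0, 1]-valued potential function $G_n$ on $E_n$.

We let $(\gamma_n, \eta_n)$ be the Feynman-Kac model associated with the pair $(\mathcal{G}_n, \mathcal{M}_n)$
on the path spaces $\mathbf{E}_n$. By construction, for any function $\mathbf{f}_n$ on $\mathbf{E}_n$, we have

$$\gamma_n(\mathbf{f}_n) = \mathbb{E}\Big(\mathbf{f}_n(\mathcal{X}_n) \prod_{0 \le p < n} \mathcal{G}_p(\mathcal{X}_p)\Big) = \mathbb{E}\Big(\mathbf{f}_n(X_0, \ldots, X_n) \prod_{0 \le p < n} G_p(X_p)\Big)$$

from which we conclude that

$$\gamma_n = Z_n\; \mathbb{Q}_n \quad\text{and}\quad \eta_n = \mathbb{Q}_n \qquad (47)$$

where $\mathbb{Q}_n$ is the Feynman-Kac measure on path space associated with the pair
$(G_n, M_n)$, and defined in (1).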

We end this section with the proof of the backward formula. Using the
decomposition

$$\mathbb{Q}_n(d(x_0, \ldots, x_n)) = \frac{Z_{n-1}}{Z_n}\; \mathbb{Q}_{n-1}(d(x_0, \ldots, x_{n-1}))\; Q_n(x_{n-1}, dx_n)$$

we prove the following formulae

$$\eta_n(dx_n) = \frac{Z_{n-1}}{Z_n}\; \eta_{n-1} Q_n(dx_n) \qquad (48)$$

$$\eta_n(dx_n) = \frac{1}{\eta_{n-1}(G_{n-1})}\; \eta_{n-1}(G_{n-1} H_n(\cdot, x_n))\; \lambda_n(dx_n) \qquad (49)$$

and

$$\frac{Z_n}{Z_{n-1}} = \eta_{n-1} Q_n(1) = \eta_{n-1}(G_{n-1})$$

This implies that

$$\frac{d\eta_{n-1} Q_n}{d\eta_n}(x_n) \times \frac{d\eta_{n-2} Q_{n-1}}{d\eta_{n-1}}(x_{n-1}) \times \cdots \times \frac{d\eta_0 Q_1}{d\eta_1}(x_1) = \frac{Z_n}{Z_{n-1}} \times \frac{Z_{n-1}}{Z_{n-2}} \times \cdots \times \frac{Z_1}{Z_0} = Z_n$$

Using these observations, we readily prove the desired backward
decomposition formula.

3.3 Semigroup models 

This section is concerned with the semigroup structure and the weak regularity 
properties of Feynman-Kac models. 

Definition 3.1 We denote by

$$\Phi_{p,n}(\eta_p) = \eta_n \quad\text{and}\quad \gamma_p Q_{p,n} = \gamma_n$$

with $0 \le p \le n$, the nonlinear and the linear semigroups associated with the
normalized and the unnormalized Feynman-Kac measures. For p = n, we use
the convention $\Phi_{n,n} = Id$, the identity mapping.

Notice that $Q_{p,n}$ has the following functional representation

$$Q_{p,n}(f_n)(x_p) := \mathbb{E}\Big(f_n(X_n) \prod_{p \le q < n} G_q(X_q) \;\Big|\; X_p = x_p\Big)$$

Definition 3.2 We let $G_{p,n}$ and $P_{p,n}$ be the potential functions and the Markov
transitions defined by

$$Q_{p,n}(1)(x) = G_{p,n}(x) \quad\text{and}\quad P_{p,n}(f) = \frac{Q_{p,n}(f)}{Q_{p,n}(1)}$$

We also set

$$g_{p,n} := \sup_{x,y} \frac{G_{p,n}(x)}{G_{p,n}(y)} \quad\text{and}\quad \beta(P_{p,n}) = \sup\, \mathrm{osc}(P_{p,n}(f))$$

The r.h.s. supremum is taken over the set of functions Osc(E). To simplify notation,
for n = p + 1 we have also set

$$G_{p,p+1} = Q_{p,p+1}(1) = G_p$$

and sometimes we write $g_p$ instead of $g_{p,p+1}$.

The particle concentration inequalities developed in chapter [S] will be ex- 
pressed in terms of the following parameters. 

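Definition 3.3 For any $k, l \ge 0$, we also set

$$\tau_{k,l}(n) := \sum_{0 \le p \le n} g_{p,n}^{k}\; \beta(P_{p,n})^{l} \quad\text{and}\quad \kappa(n) := \sup_{0 \le p \le n} \big(g_{p,n}\; \beta(P_{p,n})\big) \qquad (50)$$

Using the fact that

$$\eta_n(f_n) := \eta_p Q_{p,n}(f_n) \,/\, \eta_p Q_{p,n}(1) \qquad (51)$$

we prove the following formula

$$\Phi_{p,n}(\eta_p) = \Psi_{G_{p,n}}(\eta_p)\; P_{p,n}$$

for any $0 \le p \le n$.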

As a direct consequence of ([T^ and ([20]) . we quote the following weak reg- 
ularity property of the Feynman-Kac semigroups. 
Proposition 3.1 For any [0,1]-valued potential functions $G_n$, and any couple of
measures $\nu, \mu$ on the set $E_p$ s.t. $\mu(G_{p,n}) \wedge \nu(G_{p,n}) > 0$, we have the
decomposition

$$\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu) = \frac{1}{\nu(G_{p,n})}\; (\mu - \nu)\, S_{G_{p,n},\mu}\, P_{p,n}$$

In addition, we have the following Lipschitz estimates

$$\|\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu)\|_{tv} \le \frac{\|G_{p,n}\|}{\mu(G_{p,n}) \vee \nu(G_{p,n})}\; \beta(P_{p,n})\; \|\mu - \nu\|_{tv}$$

and

$$\sup_{\mu,\nu} \|\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu)\|_{tv} = \beta(P_{p,n})$$

3.4 Stability properties 

3.4.1 Regularity conditions 

In this section we present one of the simplest quantitative contraction estimates
we know for the normalized Feynman-Kac semigroups $\Phi_{p,n}$. We consider the
following regularity conditions.

$H_m(G, M)$ There exists some integer $m \ge 1$, such that for any $n \ge 0$, and any
$((x, x'), A) \in (E_n^2 \times \mathcal{E}_{n+m})$ we have

$$M_{n,n+m}(x, A) \le \chi_m\; M_{n,n+m}(x', A) \quad\text{and}\quad g = \sup_{n \ge 0} g_n < \infty$$

for some finite parameters $\chi_m, g < \infty$.

$H_0(G, M)$

$$\rho := \sup_{n \ge 0} \big(g_n\; \beta(M_{n+1})\big) < 1 \quad\text{and}\quad g = \sup_{n} g_n < \infty \qquad (52)$$

Both conditions are related to the stability properties of the reference Markov
chain model $X_n$ with probability transition $M_n$. They imply that the chain
$X_n$ tends to merge exponentially fast the random states starting from any two
different locations.
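On a finite state space the quantities entering condition $H_0(G, M)$ are easy to evaluate. The sketch below uses the standard identification, assumed here, of the contraction coefficient $\beta(M)$ with the largest total variation distance between two rows of $M$. The transition matrix, the potential and the two test measures are hypothetical values chosen for illustration.

```python
def tv(mu, nu):
    """Total variation distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def dobrushin(M):
    """beta(M) = sup_{x, y} || M(x, .) - M(y, .) ||_tv for a finite matrix."""
    return max(tv(rx, ry) for rx in M for ry in M)

def push(mu, M):
    """The measure mu M, i.e. (mu M)(y) = sum_x mu(x) M(x, y)."""
    return [sum(mu[x] * M[x][y] for x in range(len(mu)))
            for y in range(len(M[0]))]

# Hypothetical time homogeneous model.
M = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
G = [1.0, 0.8, 0.6]

beta = dobrushin(M)
g = max(G) / min(G)            # g = sup_{x, y} G(x) / G(y)
rho = g * beta                 # H_0(G, M) asks for rho < 1

mu, nu = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]
lhs = tv(push(mu, M), push(nu, M))
rhs = beta * tv(mu, nu)
print(beta, g, rho, lhs, rhs)
```

In this example the contraction inequality for M is attained with equality by the two Dirac masses, and rho stays below one, so the toy model satisfies the time homogeneous version of $H_0(G, M)$.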

One natural strategy to obtain some useful quantitative contraction esti- 
mate for the Markov transitions Pp^„ is to write this transition in terms of the 
composition of Markov transitions. 

Lemma 3.1 For any $0 \le p \le q \le n$, we have

$$P_{p,n} = R^{(n)}_{p,q}\; P_{q,n} \quad\text{and}\quad P_{p,n} = R^{(n)}_{p+1}\, R^{(n)}_{p+2} \cdots R^{(n)}_{n-1}\, R^{(n)}_{n}$$

with the triangular array of Markov transitions $R^{(n)}_{p,q}$ and $(R^{(n)}_{q})_{1 \le q \le n}$ defined
by

$$R^{(n)}_{p,q}(f) = \frac{Q_{p,q}(G_{q,n}\, f)}{Q_{p,q}(G_{q,n})} = \frac{P_{p,q}(G_{q,n}\, f)}{P_{p,q}(G_{q,n})}$$

and

$$R^{(n)}_{p}(f) = \frac{Q_p(G_{p,n}\, f)}{Q_p(G_{p,n})} = \frac{M_p(G_{p,n}\, f)}{M_p(G_{p,n})}$$

In addition, for any $0 \le p \le q \le n$ we have

$$\beta\big(R^{(n)}_{p,q}\big) \le g_{q,n}\; \beta(P_{p,q}) \quad\text{and}\quad \log g_{p,n} \le \sum_{p \le q < n} (g_q - 1)\; \beta(P_{p,q}) \qquad (53)$$

p<q<7L 

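Proof:

Using the decomposition

$$Q_{p,n}(f) = Q_{p,q}(Q_{q,n}(f)) = Q_{p,q}\big(Q_{q,n}(1)\; P_{q,n}(f)\big)$$

we easily check the first assertion. The l.h.s. inequality in (53) is a direct
consequence of (23). Using the multiplicative formula (5), the proof of the r.h.s.
inequality in (53) is based on the fact that

$$\frac{G_{p,n}(x)}{G_{p,n}(y)} = \frac{\prod_{p \le q < n} \Phi_{p,q}(\delta_x)(G_q)}{\prod_{p \le q < n} \Phi_{p,q}(\delta_y)(G_q)} = \exp\Big\{\sum_{p \le q < n} \big(\log \Phi_{p,q}(\delta_x)(G_q) - \log \Phi_{p,q}(\delta_y)(G_q)\big)\Big\}$$

Using the fact that

$$\log y - \log x = \int_0^1 \frac{(y - x)}{x + t(y - x)}\; dt$$

for any positive numbers x, y, we prove that

$$\frac{G_{p,n}(x)}{G_{p,n}(y)} = \exp\Big\{\sum_{p \le q < n} \int_0^1 \frac{\Phi_{p,q}(\delta_x)(G_q) - \Phi_{p,q}(\delta_y)(G_q)}{\Phi_{p,q}(\delta_y)(G_q) + t\,\big(\Phi_{p,q}(\delta_x)(G_q) - \Phi_{p,q}(\delta_y)(G_q)\big)}\; dt\Big\}$$

$$\le \exp\Big\{\sum_{p \le q < n} \widetilde{g}_q \times \big(\Phi_{p,q}(\delta_x)(\widetilde{G}_q) - \Phi_{p,q}(\delta_y)(\widetilde{G}_q)\big)\Big\}$$

with

$$\widetilde{G}_q := G_q / \mathrm{osc}(G_q) \quad\text{and}\quad \widetilde{g}_q := \mathrm{osc}(G_q) / \inf G_q \le g_q - 1$$

We end the proof of the desired estimates using (23) and proposition 3.1. This
completes the proof of the lemma. ■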



3.4.2 Quantitative contraction theorems 

This section is mainly concerned with the proof of two contraction theorems
that can be derived under the couple of regularity conditions presented in
section 3.4.1.






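Theorem 3.1 We assume that condition $H_m(G, M)$ is satisfied for some finite
parameters $\chi_m, g < \infty$, and some integer $m \ge 1$. In this situation, we have the
uniform estimates

$$\sup_{0 \le p \le n} g_{p,n} \le \chi_m\, g^m \quad\text{and}\quad \sup_{p \ge 0} \beta(P_{p,p+km}) \le \big(1 - g^{-(m-1)} \chi_m^{-2}\big)^{k} \qquad (54)$$

In addition, for any couple of measures $\nu, \mu \in \mathcal{P}(E_p)$, and for any $f \in \mathrm{Osc}(E_n)$
we have the decomposition

$$|[\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu)](f)| \le \rho_m\; (1 - \kappa_m)^{(n-p)/m}\; |(\mu - \nu) D_{p,n,\mu}(f)| \qquad (55)$$

for some function $D_{p,n,\mu}(f) \in \mathrm{Osc}(E_p)$ whose values only depend on the
parameters $(p, n, \mu)$, and some parameters $\rho_m < \infty$ and $\kappa_m \in\, ]0, 1]$ such that

$$\rho_m \le \chi_m\, g^m\, \big(1 - g^{-(m-1)} \chi_m^{-2}\big)^{-1} \quad\text{and}\quad \kappa_m \ge g^{-(m-1)} \chi_m^{-2} \qquad (56)$$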

Proof:

For any non negative function f, we notice that

$$R^{(n)}_{p,p+m}(f)(x) = \frac{Q_{p,p+m}(G_{p+m,n}\, f)(x)}{Q_{p,p+m}(G_{p+m,n})(x)} \ge g^{-(m-1)} \chi_m^{-2}\; \frac{M_{p,p+m}(G_{p+m,n}\, f)(x')}{M_{p,p+m}(G_{p+m,n})(x')}$$

and for any $p + m \le n$

$$\frac{G_{p,n}(x)}{G_{p,n}(x')} = \frac{Q_{p,p+m}(G_{p+m,n})(x)}{Q_{p,p+m}(G_{p+m,n})(x')} \le g^m\; \frac{M_{p,p+m}(G_{p+m,n})(x)}{M_{p,p+m}(G_{p+m,n})(x')} \le \chi_m\, g^m$$

For $p \le n \le p + m$, this upper bound remains valid. We conclude that

$$G_{p,n}(x) \le \chi_m\, g^m\; G_{p,n}(x') \quad\text{and}\quad \beta\big(R^{(n)}_{p,p+m}\big) \le 1 - g^{-(m-1)} \chi_m^{-2}$$

In the same way as above, we have

$$n = p + km \implies P_{p,p+km} = R^{(n)}_{p,p+m}\, R^{(n)}_{p+m,p+2m} \cdots R^{(n)}_{p+(k-1)m,p+km}$$

and

$$\beta(P_{p,p+km}) \le \prod_{1 \le l \le k} \beta\big(R^{(n)}_{p+(l-1)m,p+lm}\big) \le \big(1 - g^{-(m-1)} \chi_m^{-2}\big)^{k}$$

This ends the proof of (54).

The proof of (55) is based on the decomposition

$$(\mu - \nu)\, S_{G_{p,n},\mu}\, P_{p,n}(f) = \beta\big(S_{G_{p,n},\mu}\, P_{p,n}\big) \times (\mu - \nu) D_{p,n,\mu}(f)$$

with

$$D_{p,n,\mu}(f) := S_{G_{p,n},\mu}\, P_{p,n}(f) \,/\, \beta\big(S_{G_{p,n},\mu}\, P_{p,n}\big)$$

On the other hand, we have

$$S_{G_{p,n},\mu}(x, dy) \ge (1 - \|G_{p,n}\|)\; \Psi_{G_{p,n}}(\mu)(dy) \implies \beta\big(S_{G_{p,n},\mu}\big) \le \|G_{p,n}\|$$

and

$$\beta\big(S_{G_{p,n},\mu}\big) / \nu(G_{p,n}) \le g_{p,n} \le \chi_m\, g^m$$

Finally, we observe that

$$\beta(P_{p,n}) \le \beta\big(P_{p,p+m\lfloor (n-p)/m \rfloor}\big)$$

from which we conclude that

$$\frac{1}{\nu(G_{p,n})}\; \beta\big(S_{G_{p,n},\mu}\, P_{p,n}\big) \le \chi_m\, g^m\; \beta\big(P_{p,p+m\lfloor (n-p)/m \rfloor}\big)$$

The end of the proof is now a direct consequence of the contraction estimate
(54). This ends the proof of the theorem. ■



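Theorem 3.2 We assume that condition $H_0(G,M)$ is satisfied for some $\rho < 1$.
In this situation, for any couple of measures $\nu, \mu \in \mathcal{P}(E_p)$, and for any
$f \in \mathrm{Osc}(E_n)$ we have the decomposition

$$\left|[\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu)](f)\right| \leq \rho^{n-p}\,\left|(\mu-\nu)D_{p,n,\mu}(f)\right|$$

for some function $D_{p,n,\mu}(f) \in \mathrm{Osc}(E_p)$, whose values only depend on the
parameters $(p,n,\mu)$. In addition, for any $0 \leq p \leq n$, we have the estimates

$$\beta(P_{p,n}) \leq \rho^{n-p} \quad\text{and}\quad g_{p,n} \leq \exp\left((g-1)(1-\rho^{n-p})/(1-\rho)\right)$$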

Proof: 

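Using proposition 3.1 and recalling that $\beta(S_{G_{n-1},\mu}) \leq \|G_{n-1}\|$, we readily prove
that

$$\left|[\Phi_n(\mu) - \Phi_n(\nu)](f)\right| \leq g_{n-1}\,\beta(M_n)\,\left|(\mu-\nu)D_{n,\mu}(f)\right| \leq \rho\,\left|(\mu-\nu)D_{n,\mu}(f)\right|$$

with the function

$$D_{n,\mu}(f) = S_{G_{n-1},\mu}M_n(f)\,/\,\beta\left(S_{G_{n-1},\mu}M_n\right) \in \mathrm{Osc}(E_{n-1})$$

Now, we can prove the theorem by induction on the parameter $n \geq p$. For
$n = p$, the desired result follows from the above discussion. Suppose we have

$$\left|[\Phi_{p,n-1}(\mu) - \Phi_{p,n-1}(\nu)](f)\right| \leq \rho^{n-1-p}\,\left|(\mu-\nu)D_{p,n-1,\mu}(f)\right|$$

for any $f \in \mathrm{Osc}(E_{n-1})$, and some functions $D_{p,n-1,\mu}(f) \in \mathrm{Osc}(E_p)$. In this
case, we have

$$\left|[\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu)](f)\right| \leq g_{n-1}\,\beta(M_n)\,\left|\left(\Phi_{p,n-1}(\mu) - \Phi_{p,n-1}(\nu)\right)D_{n,\Phi_{p,n-1}(\mu)}(f)\right|$$

for any $f \in \mathrm{Osc}(E_n)$, with $D_{n,\Phi_{p,n-1}(\mu)}(f) \in \mathrm{Osc}(E_{n-1})$.
Under our assumptions, we conclude that

$$\left|[\Phi_{p,n}(\mu) - \Phi_{p,n}(\nu)](f)\right| \leq \rho^{n-p}\,\left|(\mu-\nu)D_{p,n,\mu}(f)\right|$$

with the function

$$D_{p,n,\mu}(f) := D_{p,n-1,\mu}\left(D_{n,\Phi_{p,n-1}(\mu)}(f)\right) \in \mathrm{Osc}(E_p)$$

The proof of the second assertion is a direct consequence of proposition 3.3 and
lemma 3.1. This ends the proof of the theorem. ■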



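Corollary 3.1 Under any of the conditions $H_m(G,M)$, with $m \geq 0$, the functions
$\tau_{k,l}$ and $\kappa$ defined in (50) are uniformly bounded; that is, for any $k,l \geq 1$
we have that

$$\tau_{k,l}(m) := \sup_{n\geq 0}\,\sup_{0\leq p\leq n}\tau_{k,l}(n) < \infty \quad\text{and}\quad \kappa(m) := \sup_{n\geq 0}\kappa(n) < \infty$$

In addition, for any $m \geq 1$, we have

$$\tau_{k,l}(m) \leq m\,(\chi_m g^m)^k\,/\,\left(1 - \left(1 - g^{-(m-1)}\chi_m^{-2}\right)^l\right)$$

and for $m = 0$, we have the estimates

$$\kappa(0) \leq \exp\left((g-1)/(1-\rho)\right)$$
$$\tau_{k,l}(0) \leq \exp\left(k(g-1)/(1-\rho)\right)/(1-\rho^l)$$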
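
The bound for $m \geq 1$ rests on a geometric series over blocks of length $m$, and it can be checked numerically. The sketch below assumes that $\tau_{k,l}(n)$ has the form $\sum_{0\leq p\leq n} g_{p,n}^k\,\beta(P_{p,n})^l$ (the definition (50) is not restated in this section, so this form is an assumption), plugs in the worst-case estimates $g_{p,n} \leq \chi_m g^m$ and $\beta(P_{p,n}) \leq (1-\epsilon)^{\lfloor (n-p)/m\rfloor}$ with $\epsilon = g^{-(m-1)}\chi_m^{-2}$, and compares the sum with the closed-form bound. Function names and parameter values are purely illustrative.

```python
def tau_upper_sum(n, m, k, l, chi, g):
    """Worst-case value of sum_{p<=n} g_{p,n}^k beta(P_{p,n})^l (assumed form of tau_{k,l}(n))."""
    eps = g ** (-(m - 1)) * chi ** (-2)   # contraction gained over each block of length m
    amp = (chi * g ** m) ** k             # uniform bound on g_{p,n}^k
    return sum(amp * (1.0 - eps) ** (l * ((n - p) // m)) for p in range(n + 1))


def tau_closed_bound(m, k, l, chi, g):
    """Closed-form bound m (chi g^m)^k / (1 - (1 - eps)^l)."""
    eps = g ** (-(m - 1)) * chi ** (-2)
    return m * (chi * g ** m) ** k / (1.0 - (1.0 - eps) ** l)


if __name__ == "__main__":
    for horizon in (1, 10, 100, 1000):
        print(horizon, tau_upper_sum(horizon, 3, 2, 1, 1.5, 1.2), tau_closed_bound(3, 2, 1, 1.5, 1.2))
```

Each value of $\lfloor (n-p)/m\rfloor$ is taken by at most $m$ indices $p$, which is where the factor $m$ in the bound comes from.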

3.4.3 Some illustrations 

We illustrate the regularity conditions presented in section 3.4.1 with three
different types of Feynman-Kac models, related respectively to time discretization
techniques, simulated annealing type schemes, and path space models.

Of course, a complete analysis of the regularity properties of the 20 Feynman-Kac
application models presented in section [5] would make for too long a discussion.

In some instances, the regularity conditions stated in section 3.4.1 can be
directly translated into regularity properties of the reference Markov chain model
and the adaptation potential function.

In other instances, the regularity properties of the Feynman-Kac semigroup
depend on some important tuning parameters, including discretization time
steps, and cooling schedules in simulated annealing models. In section 3.4.4
we illustrate the regularity property $H_0(G,M)$ stated in section 3.4.1 in the context of
the time discretization model with geometric style clocks introduced in (42). In
section 3.4.5 we present some tools to tune the cooling parameters of the annealing
model discussed in (27), so that the resulting semigroups are exponentially stable.

For degenerate indicator style functions, we can use a one step integration
technique to transform the model into a Feynman-Kac model on smaller state
spaces with positive potential functions. In terms of particle absorption models,
this technique turns a hard obstacle model into a soft obstacle particle
model. Further details on this integration technique can be found in [25, 29].

Last, but not least, in some important applications, including Feynman-Kac
models on path spaces, the limiting semigroups are unstable, in the sense that
they don't forget their initial conditions. Nevertheless, in some situations it
is still possible to control uniformly in time the quantities $g_{p,n}$ discussed in
section 3.3.

3.4.4 Time discretization models 

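We consider the potential functions $G_n$ and Markov transitions $M_n$ given
by (42), for some non negative function $V_n$, some positive parameter $\lambda_n$ and
some Markov transition $K_n$ s.t.

$$\beta(K_n) \leq \kappa_n < 1 \qquad h \leq h_n := (1-\kappa_n)/[v_{n-1}+\alpha] \quad\text{and}\quad \lambda_n \in ]0,1/h]$$

with $v_{n-1} := \mathrm{osc}(V_{n-1})$, and for some $\alpha > 0$. We also assume that $v = \sup_n v_n < \infty$.

In this situation, for any $\lambda_n \in [1/h_n, 1/h]$ we have

$$g_{n-1}\,\beta(M_n) \leq e^{h v_{n-1}}\,\left(1 - \lambda_n h\,(1-\kappa_n)\right) \leq e^{-h\left(\lambda_n(1-\kappa_n) - v_{n-1}\right)} \leq e^{-\alpha h}$$

from which we conclude that $H_0(G,M)$ is met with

$$g = \sup_n g_n \leq e^{hv} \quad\text{and}\quad \rho \leq e^{-\alpha h}$$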
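
As a quick numerical illustration of the last chain of inequalities, the sketch below checks the elementary fact that $e^{hv}\,(1 - \lambda h(1-\kappa)) \leq e^{-\alpha h}$ whenever $\lambda(1-\kappa) \geq v + \alpha$ and $\lambda h \leq 1$. The function names and the numerical values are hypothetical choices made only for this check; they are not taken from the model (42).

```python
import math


def one_step_bound(h, lam, kappa, v_prev):
    """Value of e^{h v} (1 - lam h (1 - kappa)), the upper bound on g_{n-1} beta(M_n)."""
    return math.exp(h * v_prev) * (1.0 - lam * h * (1.0 - kappa))


def admissible_lambdas(h, kappa, v_prev, alpha, num=50):
    """Grid of jump rates in [1/h_n, 1/h], with h_n = (1 - kappa) / (v + alpha)."""
    h_n = (1.0 - kappa) / (v_prev + alpha)
    if h > h_n:
        raise ValueError("the time step h must not exceed h_n")
    lo, hi = 1.0 / h_n, 1.0 / h
    return [lo + (hi - lo) * j / (num - 1) for j in range(num)]


if __name__ == "__main__":
    h, kappa, v_prev, alpha = 0.1, 0.5, 2.0, 0.5
    worst = max(one_step_bound(h, lam, kappa, v_prev)
                for lam in admissible_lambdas(h, kappa, v_prev, alpha))
    print(worst, "<=", math.exp(-alpha * h))
```

The worst case over the admissible range is attained at the smallest rate $\lambda = 1/h_n$, which is why the lower end of the interval matters.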



3.4.5 Interacting simulated annealing model 

We consider the Feynman-Kac annealing model discussed in (27). We further
assume that

for some $k_n \geq 1$, some $\epsilon_n > 0$, and some measure $\nu_n$.
In this situation, we have

with $v := \mathrm{osc}(V)$. If we choose $m_n = k_n l_n$, this implies that

/3(A/„) = /3 (a^™^J < /3 (a^^'x)'" < (1 - 6„ e-^"^-)'" 

Therefore, for any given $\rho' \in ]0,1[$ we can choose $l_n$ such that

log(l/p') + z;(/?„-/3„_i) 
"- log 1/(1 -6„ 6-/3"'="-) 

so that 

gn-iPiMn) < e'^(^"-^"-) X (1 - e„ e-^-'^"")'" < p' ^ p < p' 

For any function $\beta : x \in [0,\infty[ \mapsto \beta(x)$, with a decreasing derivative $\beta'(x)$ s.t.
$\lim_{x\to\infty}\beta'(x) = 0$ and $\beta'(0) < \infty$, we also notice that

$$g = \sup_{n\geq 0} g_n \leq \sup_{n\geq 0} e^{v\beta'(n)} \leq e^{v\beta'(0)}$$

3.4.6 Historical processes

We return to the historical Feynman-Kac models introduced in section 3.2.
Using the equivalence principle (47), we have proved that the $n$-time marginal
models associated with a Feynman-Kac model on path space coincide with the
original Feynman-Kac measures.

We write $\mathbf{Q}_{p,n}$ and $\mathbf{P}_{p,n}$ for the Feynman-Kac semigroups defined as $Q_{p,n}$ and
$P_{p,n}$, by replacing $(G_n, M_n)$ by $(\mathbf{G}_n, \mathbf{M}_n)$. By construction, we have

$$\mathbf{Q}_{p,n}(1)(\mathbf{x}_p) = Q_{p,n}(1)(x_p)$$

for any $\mathbf{x}_p = (x_0,\ldots,x_p) \in \mathbf{E}_p$. Therefore, if we set

$$\mathbf{G}_{p,n}(\mathbf{x}_p) := \mathbf{Q}_{p,n}(1)(\mathbf{x}_p)$$

then we find that

$$\mathbf{g}_{p,n} := \sup_{\mathbf{x}_p,\mathbf{y}_p}\frac{\mathbf{G}_{p,n}(\mathbf{x}_p)}{\mathbf{G}_{p,n}(\mathbf{y}_p)} = \sup_{x_p,y_p}\frac{G_{p,n}(x_p)}{G_{p,n}(y_p)} = g_{p,n}$$

On the other hand, we cannot expect the Dobrushin ergodic coefficient of the
historical process semigroup to decrease, but we always have $\beta(\mathbf{P}_{p,n}) \leq 1$.
In summary, when the reference Markov chain $X_n$ satisfies the condition
$H_m(G,M)$ stated at the beginning of section 3.4.1 for some $m \geq 1$, we always
have the estimates

$$\mathbf{g}_{p,n} \leq \chi_m g^m \quad\text{and}\quad \beta(\mathbf{P}_{p,n}) \leq 1 \qquad (57)$$

and

$$\tau_{k,l}(n) \leq (n+1)\,(\chi_m g^m)^k \quad\text{and}\quad \kappa(n) \leq \chi_m g^m \qquad (58)$$

with the functions $\tau_{k,l}$ and $\kappa$ introduced in (50).

We end this section with a Markov chain Monte Carlo technique often
used in practice to stabilize the genealogical tree based approximation model. To
describe this stochastic method with some precision, we consider the Feynman-Kac
measures $\boldsymbol{\eta}_n \in \mathcal{P}(\mathbf{E}_n)$ associated with the potential function $\mathbf{G}_n$ and the
Markov transitions $\mathbf{M}_n$ of the historical process defined respectively in (46) and
in (45). We notice that $\boldsymbol{\eta}_n$ satisfy the updating-prediction equation

$$\boldsymbol{\eta}_n = \Psi_{\mathbf{G}_{n-1}}(\boldsymbol{\eta}_{n-1})\,\mathbf{M}_n$$

This equation on the set of measures on path spaces is unstable, in the sense
that its initial condition is always kept in memory by the historical Markov
transitions $\mathbf{M}_n$. One idea to stabilize this system is to incorporate an additional
Markov chain Monte Carlo move at every time step. More formally, let us
suppose that we have a dedicated Markov chain Monte Carlo transition $\mathbf{K}_n$
from the set $\mathbf{E}_n$ into itself, and such that

$$\boldsymbol{\eta}_n = \boldsymbol{\eta}_n\,\mathbf{K}_n$$

In this situation, we also have that

$$\boldsymbol{\eta}_n = \Psi_{\mathbf{G}_{n-1}}(\boldsymbol{\eta}_{n-1})\,\mathbf{M}'_n \quad\text{with}\quad \mathbf{M}'_n := \mathbf{M}_n\mathbf{K}_n \qquad (59)$$

By construction, the mean field particle approximation of the equation (59)
is a genealogical tree type evolution model with path space particles on the
state spaces $\mathbf{E}_n$. The updating-selection transitions are related to the potential
function $\mathbf{G}_n$ on the state spaces $\mathbf{E}_n$, and the mutation-exploration mechanisms
from $\mathbf{E}_n$ into $\mathbf{E}_{n+1}$ are dictated by the Markov transitions $\mathbf{M}'_{n+1}$.

Notice that this mutation transition is decomposed into two different stages.
Firstly, we extend the selected path-valued particles with an elementary move
according to the Markov transition $M_n$. Then, from each of these extended
paths, we perform a Markov chain Monte Carlo sample according to the Markov
transition $\mathbf{K}_n$.

3.5 Mean field particle models 

3.5.1 Interacting particle systems 

With the exception of some very special cases, the measures $\eta_n$ cannot be
represented in a closed form, even on finite dimensional state spaces. Their
numerical estimation using deterministic grid approximations requires
extensive calculations, and these rarely cope with high dimensional problems. In
the same vein, harmonic type approximation schemes, or related linearization
style techniques such as the extended Kalman filter, often provide poor estimation
results for highly nonlinear models. In contrast with these conventional
techniques, mean field particle models can be thought of as a stochastic adaptive
grid approximation scheme. These advanced Monte Carlo methods take
advantage of the nonlinearities of the model to design an interacting
selection-recycling mechanism.

Formally speaking, discrete generation mean field particle models are based
on the fact that the flow of probability measures $\eta_n$ satisfies a nonlinear evolution
equation of the following form

$$\eta_{n+1}(dy) = \int \eta_n(dx)\,K_{n+1,\eta_n}(x,dy) \qquad (60)$$

for some collection of Markov transitions $K_{n+1,\eta}$, indexed by the time parameter
$n \geq 0$ and the set of probability measures $\mathcal{P}(E_n)$.

The choice of the McKean transitions $K_{n+1,\eta_n}$ is not unique. For instance,
we can choose

$$K_{n+1,\eta_n}(x,dy) = \Phi_{n+1}(\eta_n)(dy)$$

and more generally

$$K_{n+1,\eta_n}(x,dy) = \epsilon(\eta_n)\,G_n(x)\,M_{n+1}(x,dy) + \left(1 - \epsilon(\eta_n)\,G_n(x)\right)\Phi_{n+1}(\eta_n)(dy)$$

for any $\epsilon(\eta_n)$ s.t. $\epsilon(\eta_n)G_n(x) \in [0,1]$. Note that we can define sequentially a
Markov chain sequence $(\overline{X}_n)_{n\geq 0}$ such that

$$\mathbb{P}\left(\overline{X}_{n+1} \in dx \mid \overline{X}_n\right) = K_{n+1,\eta_n}\left(\overline{X}_n,dx\right) \quad\text{with}\quad \mathrm{Law}(\overline{X}_n) = \eta_n$$

From the practical point of view, this Markov chain can be seen as a perfect
sampler of the flow of the distributions (60) of the random states $\overline{X}_n$. For a
more thorough discussion on these nonlinear Markov chain models, we refer the
reader to section 2.5 in the book [25].

The mean field particle interpretation of this nonlinear measure valued model
is the $E_n^N$-valued Markov chain

$$\xi_n = \left(\xi_n^1,\ldots,\xi_n^N\right) \in E_n^N$$

with elementary transitions defined as

$$\mathbb{P}\left(\xi_{n+1} \in dx \mid \xi_n\right) = \prod_{i=1}^N K_{n+1,\eta_n^N}\left(\xi_n^i,dx^i\right) \qquad (61)$$

with

$$\eta_n^N := \frac{1}{N}\sum_{j=1}^N \delta_{\xi_n^j}$$

In the above displayed formula, $dx$ stands for an infinitesimal neighborhood
of the point $x = (x^1,\ldots,x^N) \in E_{n+1}^N$. The initial system $\xi_0$ consists of $N$
independent and identically distributed random variables with common law $\eta_0$.

We let $\mathcal{G}_n^N := \sigma(\xi_0,\ldots,\xi_n)$ be the natural filtration associated with the
$N$-particle approximation model defined above.

The particle model associated with the parameter $\epsilon(\eta_n) = 1$ coincides with
the genetic type stochastic algorithm presented in section 1.4.
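
One transition of the $N$-particle model can be sketched as follows, assuming the McKean transition of the form $\epsilon\,G_n(x)\,M_{n+1}(x,dy) + (1 - \epsilon\,G_n(x))\,\Phi_{n+1}(\eta_n^N)(dy)$, in which $\Phi_{n+1}(\eta_n^N)$ is sampled by first drawing a particle with probability proportional to its potential and then applying the mutation. The names `mckean_step`, `G`, `mutate` and `eps` are illustrative and do not come from the text; the potential and the mutation kernel are supplied by the caller.

```python
import random


def mckean_step(particles, G, mutate, eps, rng):
    """One transition xi_n -> xi_{n+1} of the N-particle model (illustrative sketch).

    Each particle x is kept with probability eps * G(x); otherwise it is replaced
    by a particle drawn from the current population with probability proportional
    to its potential. The selected particle then moves with the mutation kernel.
    """
    weights = [G(x) for x in particles]
    new_particles = []
    for x, w in zip(particles, weights):
        accept = eps * w
        if not 0.0 <= accept <= 1.0:
            raise ValueError("eps * G(x) must lie in [0, 1]")
        if rng.random() < accept:
            selected = x                                              # acceptance: keep the particle
        else:
            selected = rng.choices(particles, weights=weights, k=1)[0]  # selection jump
        new_particles.append(mutate(selected, rng))                   # mutation step
    return new_particles
```

With `eps = 0` every particle is resampled from the weighted empirical measure, which is the first (simplest) choice of McKean transition; a positive `eps` reduces the number of selection jumps.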

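Furthermore, using the equivalence principles (47) presented in section 3.2,
we can check that the genealogical tree model discussed above coincides with
the mean field $N$-particle interpretation of the Feynman-Kac measures $(\boldsymbol{\gamma}_n,\boldsymbol{\eta}_n)$
associated with the pair $(\mathbf{G}_n,\mathbf{M}_n)$ on the path spaces $\mathbf{E}_n$. In this context, we
recall that $\boldsymbol{\eta}_n = \mathbb{Q}_n$, and the $N$-particle approximation measures are given by

$$\boldsymbol{\eta}_n^N := \frac{1}{N}\sum_{i=1}^N \delta_{\left(\xi^i_{0,n},\,\xi^i_{1,n},\ldots,\,\xi^i_{n,n}\right)} \in \mathcal{P}(\mathbf{E}_n) = \mathcal{P}(E_0\times\ldots\times E_n) \qquad (62)$$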

3.6 Local sampling errors 

The local sampling errors induced by the mean field particle model (61) are
expressed in terms of the empirical random field sequence $V_n^N$ defined by

$$V_{n+1}^N = \sqrt{N}\left[\eta_{n+1}^N - \Phi_{n+1}\left(\eta_n^N\right)\right]$$

Notice that $V_{n+1}^N$ is alternatively defined by the following stochastic perturbation
formulae

$$\eta_{n+1}^N = \Phi_{n+1}\left(\eta_n^N\right) + \frac{1}{\sqrt{N}}\,V_{n+1}^N \qquad (63)$$

For $n = 0$, we also set

$$V_0^N = \sqrt{N}\left[\eta_0^N - \eta_0\right] \iff \eta_0^N = \eta_0 + \frac{1}{\sqrt{N}}\,V_0^N$$

In this interpretation, the $N$-particle model can also be interpreted as a
stochastic perturbation of the limiting system

$$\eta_{n+1} = \Phi_{n+1}(\eta_n)$$

It is rather elementary to check that

$$\mathbb{E}\left(V_{n+1}^N(f) \mid \mathcal{G}_n^N\right) = 0$$
$$\mathbb{E}\left(V_{n+1}^N(f)^2 \mid \mathcal{G}_n^N\right) = \eta_n^N\left[K_{n+1,\eta_n^N}\left(f - K_{n+1,\eta_n^N}(f)\right)^2\right]$$

Definition 3.4 We denote by $\sigma_n^2$ the uniform local variance parameter given by

$$\sigma_n^2 := \sup \mu\left(K_{n,\mu}\left[f_n - K_{n,\mu}(f_n)\right]^2\right) \leq 1 \qquad (64)$$

In the above displayed formula the supremum is taken over all functions $f_n \in \mathrm{Osc}(E_n)$,
and all probability measures $\mu$ on $E_n$, with $n \geq 1$. For $n = 0$, we set

$$\sigma_0^2 = \sup_{f_0\in\mathrm{Osc}(E_0)} \eta_0\left([f_0 - \eta_0(f_0)]^2\right) \leq 1$$

We close this section with a brief discussion on these uniform local variance
parameters in the context of continuous time discretization models. When the
discrete time model $K_{n,\mu} = K^{(h)}_{n,\mu}$ comes from a discretization of the continuous
time model with time step $\Delta t = h\ (\leq 1)$, we often have that

$$K_{n,\mu} = Id + h\,L_{n,\mu} + O(h^2) \qquad (65)$$

for some infinitesimal generator $L_{n,\mu}$. In this situation, we also have that

$$L_{K_{n,\mu}} := K_{n,\mu} - Id = h\,L_{n,\mu} + O(h^2)$$

For any Markov transition $K$, we notice that

$$K\left([f - K(f)]^2\right) = K(f^2) - K(f)^2 = L_K(f^2) - 2f\,L_K(f) - (L_K(f))^2 = \Gamma_{L_K}(f,f) - (L_K(f))^2$$

with the "carré du champ" function $\Gamma_{L_K}(f,f)$ defined for any $x \in E$ by

$$\Gamma_{L_K}(f,f)(x) = L_K\left([f - f(x)]^2\right)(x) = L_K(f^2)(x) - 2f(x)\,L_K(f)(x)$$

When $K = K_{n,\mu}$ and $L_{K_{n,\mu}} = h\,L_{n,\mu} + O(h^2)$, we find that

$$\mu\left(K_{n,\mu}\left[f_n - K_{n,\mu}(f_n)\right]^2\right) = \mu\left(\Gamma_{L_{K_{n,\mu}}}(f_n,f_n)\right) - \mu\left(\left(L_{K_{n,\mu}}(f_n)\right)^2\right) = h\,\mu\left(\Gamma_{L_{n,\mu}}(f_n,f_n)\right) + O(h^2)$$
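
The algebraic identity between the local variance $K([f-K(f)]^2)$ and the carré du champ of $L_K = K - Id$ can be verified on a finite state space, where a Markov transition is a row-stochastic matrix. The following sketch is only an illustration under that finite-state assumption; the helper names and the example matrix are hypothetical.

```python
def apply_kernel(K, f):
    """(K f)(x) = sum_y K[x][y] f[y] for a row-stochastic matrix K."""
    return [sum(K[x][y] * f[y] for y in range(len(f))) for x in range(len(K))]


def local_variance(K, f):
    """K([f - K(f)(x)]^2)(x), the variance of f under the law K(x, .)."""
    Kf = apply_kernel(K, f)
    n = len(f)
    return [sum(K[x][y] * (f[y] - Kf[x]) ** 2 for y in range(n)) for x in range(n)]


def carre_du_champ(K, f):
    """Gamma(f, f)(x) = L(f^2)(x) - 2 f(x) L(f)(x), with L = K - Id."""
    Kf = apply_kernel(K, f)
    Kf2 = apply_kernel(K, [v * v for v in f])
    return [(Kf2[x] - f[x] ** 2) - 2.0 * f[x] * (Kf[x] - f[x]) for x in range(len(f))]


def identity_gap(K, f):
    """Largest deviation between K([f-Kf]^2) and Gamma(f,f) - (L f)^2 over all states."""
    Kf = apply_kernel(K, f)
    gamma = carre_du_champ(K, f)
    var = local_variance(K, f)
    return max(abs(var[x] - (gamma[x] - (Kf[x] - f[x]) ** 2)) for x in range(len(f)))


if __name__ == "__main__":
    K = [[0.5, 0.3, 0.2], [0.1, 0.8, 0.1], [0.25, 0.25, 0.5]]
    print(identity_gap(K, [0.0, 0.3, 1.0]))  # zero up to rounding errors
```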



4 Empirical processes 

4.1 Introduction 

The aim of this chapter is to review some more or less well known stochastic
techniques for analyzing the concentration properties of empirical processes
associated with independent random sequences. The discussion at the start of
this section provides some basic definitions on empirical processes associated
with sequences of independent random variables on general measurable state
spaces. In section 4.2 we state and comment on the main results of this section.
Section 4.2.1 is concerned with finite marginal models. In section 4.2.2 we
extend these results to the level of the empirical processes. Although
the concentration inequalities for empirical processes hold for suprema
of empirical processes over infinite collections of functions, these inequalities are
cruder, with greater constants, than the ones for marginal models. These
two sections also contain two new perturbation theorems that apply to nonlinear
functionals of empirical processes. The proofs of these theorems combine
Orlicz norm techniques, Khintchine type inequalities, maximal inequalities, as
well as Laplace-Cramér-Chernoff estimation methods. These four complementary
methodologies are presented in the sections that follow.

Let $(\mu^i)_{i\geq 1}$ be a sequence of probability measures on a given measurable
state space $(E,\mathcal{E})$. During the further development of this section, we fix an
integer $N \geq 1$. To clarify the presentation, we slightly abuse the notation and
we denote respectively by

$$m(X) = \frac{1}{N}\sum_{i=1}^N \delta_{X^i} \quad\text{and}\quad \mu = \frac{1}{N}\sum_{i=1}^N \mu^i$$

the $N$-empirical measure associated with a collection of independent random
variables $X = (X^i)_{i\geq 1}$, with respective distributions $(\mu^i)_{i\geq 1}$, and the $N$-averaged
measure associated with the sequence of measures $(\mu^i)_{i\geq 1}$. We also consider the
empirical random field sequences

$$V(X) = \sqrt{N}\,(m(X) - \mu)$$

We also set

$$\sigma(f)^2 := \mathbb{E}\left(V(X)(f)^2\right) = \frac{1}{N}\sum_{i=1}^N \mu^i\left([f - \mu^i(f)]^2\right) \qquad (66)$$
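
The variance formula for the empirical random field (the second moment of $V(X)(f)$ equals the average of the variances of $f$ under the laws $\mu^i$) can be checked exactly when the laws $\mu^i$ have finite support, by enumerating the product law of $(X^1,\ldots,X^N)$. The sketch below does this for small $N$; the representation of each law as a dictionary `{point: probability}` and the function names are assumptions made for the illustration.

```python
import itertools
import math


def sigma2(mus, f):
    """(1/N) sum_i mu^i([f - mu^i(f)]^2) for discrete laws mus[i] = {point: probability}."""
    total = 0.0
    for mu in mus:
        mean = sum(p * f(x) for x, p in mu.items())
        total += sum(p * (f(x) - mean) ** 2 for x, p in mu.items())
    return total / len(mus)


def second_moment_of_V(mus, f):
    """E(V(X)(f)^2), computed by exhaustive enumeration of the independent draws."""
    N = len(mus)
    averaged_mean = sum(sum(p * f(x) for x, p in mu.items()) for mu in mus) / N
    acc = 0.0
    for outcome in itertools.product(*(mu.items() for mu in mus)):
        prob = math.prod(p for _, p in outcome)          # independence of the X^i
        empirical_mean = sum(f(x) for x, _ in outcome) / N
        acc += prob * (math.sqrt(N) * (empirical_mean - averaged_mean)) ** 2
    return acc


if __name__ == "__main__":
    laws = [{0: 0.5, 1: 0.5}, {0: 0.2, 1: 0.3, 2: 0.5}, {1: 0.9, 2: 0.1}]
    f = lambda x: x / 2.0
    print(sigma2(laws, f), second_moment_of_V(laws, f))
```

The enumeration cost grows exponentially with $N$, so this is only meant as a consistency check on toy examples.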



Remark 4.1 The rather abstract models presented above can be used to analyze
the local sampling random field models associated with the mean field particle
model discussed in section 3.6.

To be more precise, given the information on the $N$-particle model at time
$(n-1)$, the random variables $\xi_n^i$ are independent, with a distribution that
depends on the current state $\xi_{n-1}^i$. That is, at any given
fixed time horizon $n$ and given $\mathcal{G}_{n-1}^N$, we have

$$X^i = \xi_n^i \in E = E_n \quad\text{and}\quad \mu^i(dx) := K_{n,\eta_{n-1}^N}\left(\xi_{n-1}^i,dx\right) \qquad (67)$$

In this case, we find that

$$m(X) = \eta_n^N \quad\text{and}\quad V(X) = V_n^N$$

and

$$\sigma(f)^2 = \mathbb{E}\left(V_n^N(f)^2 \mid \mathcal{G}_{n-1}^N\right)$$

Let $\mathcal{F}$ be a given collection of measurable functions $f : E \to \mathbb{R}$ such that
$\|f\| \leq 1$. We associate with $\mathcal{F}$ the Zolotarev seminorm on $\mathcal{P}(E)$ defined by

$$\|\mu - \nu\|_{\mathcal{F}} = \sup\{|\mu(f) - \nu(f)|\;;\; f \in \mathcal{F}\},$$

(see for instance [H]). No generality is lost and much convenience is gained
by supposing that the unit and the null functions $f = 1$ and $f = 0$ belong to $\mathcal{F}$.
Furthermore, to avoid some unnecessary technical measurability questions, we
shall also suppose that $\mathcal{F}$ is separable in the sense that it contains a countable
and dense subset.

We measure the size of a given class $\mathcal{F}$ in terms of the covering numbers
$\mathcal{N}(\epsilon,\mathcal{F},L_2(\mu))$ defined as the minimal number of $L_2(\mu)$-balls of radius $\epsilon > 0$
needed to cover $\mathcal{F}$. We shall also use the following uniform covering numbers
and entropies.

We end this section with the last of the notation to be used in this chapter 
dedicated to empirical processes concentration inequalities. 

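Definition 4.1 By $\mathcal{N}(\epsilon,\mathcal{F})$, $\epsilon > 0$, and by $I(\mathcal{F})$ we denote the uniform covering
numbers and entropy integral given by

$$\mathcal{N}(\epsilon,\mathcal{F}) = \sup\{\mathcal{N}(\epsilon,\mathcal{F},L_2(\eta))\;;\;\eta \in \mathcal{P}(E)\}$$
$$I(\mathcal{F}) = \int_0^2 \sqrt{\log\left(1 + \mathcal{N}(\epsilon,\mathcal{F})\right)}\;d\epsilon$$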

The concentration inequalities stated in this section are expressed in terms of
the inverse of the couple of functions defined below.

Definition 4.2 We let $(\epsilon_0,\epsilon_1)$ be the functions on $\mathbb{R}_+$ defined by

$$\epsilon_0(\lambda) = \frac{1}{2}\left(\lambda - \log(1+\lambda)\right) \quad\text{and}\quad \epsilon_1(\lambda) = (1+\lambda)\log(1+\lambda) - \lambda$$

Rather crude estimates can be derived using the following upper bounds

$$\epsilon_0^{-1}(x) \leq 2\,(x + \sqrt{x}) \quad\text{and}\quad \epsilon_1^{-1}(x) \leq \frac{x}{3} + \sqrt{2x}$$

A proof of these elementary inequalities and refined estimates can be found in
the recent article [44].
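
Since the inverse functions have no closed form, they are conveniently evaluated by bisection. The sketch below, with function names chosen here for illustration, computes both inverses numerically and compares them with the two crude upper bounds on a few sample points.

```python
import math


def eps0(lam):
    """eps_0(lambda) = (lambda - log(1 + lambda)) / 2."""
    return 0.5 * (lam - math.log1p(lam))


def eps1(lam):
    """eps_1(lambda) = (1 + lambda) log(1 + lambda) - lambda."""
    return (1.0 + lam) * math.log1p(lam) - lam


def inverse(func, x, tol=1e-12):
    """Inverse at x of an increasing function on [0, inf) with func(0) = 0, by bisection."""
    lo, hi = 0.0, 1.0
    while func(hi) < x:          # bracket the solution
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if func(mid) < x:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(x, inverse(eps0, x), 2.0 * (x + math.sqrt(x)),
              inverse(eps1, x), x / 3.0 + math.sqrt(2.0 * x))
```

The bound on the inverse of $\epsilon_1$ is fairly tight for small $x$, whereas the bound on the inverse of $\epsilon_0$ is loose by roughly a factor two for large $x$.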

4.2 Statement of the main results 

4.2.1 Finite marginal models 

The main result of this section is a quantitative concentration inequality for the
finite marginal models

$$f \mapsto V(X)(f) = \sqrt{N}\,\left(m(X)(f) - \mu(f)\right)$$

In the following theorem, we provide Khintchine type mean error bounds,
and related Orlicz norm estimates. The detailed proofs of these results are
housed in section 4.4. The last quantitative concentration inequality is a direct
consequence of (70), and its proof is given in a separate remark.

Theorem 4.1 For any integer $m \geq 1$, and any measurable function $f$ we have
the $\mathbb{L}_m$-mean error estimates

$$\mathbb{E}\left(|V(X)(f)|^m\right)^{1/m} \leq b(m)\,\left(\mathrm{osc}(f) \wedge \left[2\,\mu\left(|f|^{m'}\right)^{1/m'}\right]\right) \qquad (68)$$

and

$$\mathbb{E}\left(|V(X)(f)|^m\right)^{1/m} \leq 6\,b(m)^2\,\max\left(\sqrt{2}\,\sigma(f),\ \left[\frac{2\,\sigma(f)^2}{N^{m'/2-1}}\right]^{1/m'}\right)$$

with the smallest even integer $m' \geq m$, and the collection of constants $b(m)$
defined in (81).

In particular, for any $f \in \mathrm{Osc}(E)$, we have

$$\pi_\psi\left(V(X)(f)\right) \leq \sqrt{3/8} \qquad (69)$$

and for any $N$ s.t. $2\sigma^2(f)N \geq 1$ we have

$$\mathbb{E}\left(|V(X)(f)|^m\right)^{1/m} \leq 6\sqrt{2}\;b(m)^2\,\sigma(f) \qquad (70)$$

In addition, the probability of the event

$$|V(X)(f)| \leq 6\sqrt{2}\,\sigma(f)\,\left(1 + \epsilon_0^{-1}(x)\right)$$

is greater than $1 - e^{-x}$, for any $x \geq 0$.

In section 4.3.2 dedicated to concentration properties of random variables $Y$
with finite Orlicz norms $\pi_\psi(Y) < \infty$, we shall prove that the probability of the
event

$$Y \leq \pi_\psi(Y)\,\sqrt{y + \log 2}$$

is greater than $1 - e^{-y}$, for any $y \geq 0$ (cf. lemma 4.2). This implies that the
probability of the events

$$|V(X)(f)| \leq \frac{1}{2}\sqrt{3\,(x + \log 2)/2}$$

is greater than $1 - e^{-x}$, for any $x \geq 0$.

Our next objective is to derive concentration inequalities for nonlinear
functionals of the empirical random field $V(X)$. To introduce these objects precisely,
we need another round of notation.

For any measure $\nu$, and any sequence of measurable functions $f = (f_1,\ldots,f_d)$,
we write

$$\nu(f) := [\nu(f_1),\ldots,\nu(f_d)]$$

Definition 4.3 We associate with the second order smooth function $F$ on $\mathbb{R}^d$,
for some $d \geq 1$, the random functionals defined by

$$f = (f_i)_{1\leq i\leq d} \in \mathrm{Osc}(E)^d \mapsto F(m(X)(f)) = F\left(m(X)(f_1),\ldots,m(X)(f_d)\right) \in \mathbb{R} \qquad (71)$$

Given a probability measure $\nu$, and a collection of functions $(f_i)_{1\leq i\leq d} \in \mathrm{Osc}(E)^d$,
we set

$$D_\nu(F)(f) = \nabla F(\nu(f))\,f^{\top} \qquad (72)$$

Notice that

$$D_\nu(F)(f) = \sum_{i=1}^d \frac{\partial F}{\partial u^i}(\nu(f))\,f_i$$

We also introduce the following constants

$$\|\nabla F(\nu(f))\|_1 := \sum_{i=1}^d\left|\frac{\partial F}{\partial u^i}(\nu(f))\right| \quad\text{and}\quad \|\nabla^2 F_f\|_1 := \sup\sum_{i,j=1}^d\left|\frac{\partial^2 F}{\partial u^i\partial u^j}(\nu(f))\right| \qquad (73)$$

In the r.h.s. display, the supremum is taken over all probability measures $\nu \in \mathcal{P}(E)$.

The next theorem extends the exponential inequalities stated in theorem 4.1
to this class of nonlinear functionals. It also provides more precise concentration
properties in terms of the variance functional $\sigma$ defined in (66). The proof of
this theorem is postponed to section 4.7.

Theorem 4.2 Let $F$ be a second order smooth function on $\mathbb{R}^d$, for some $d \geq 1$.
For any collection of functions $(f_i)_{1\leq i\leq d} \in \mathrm{Osc}(E)^d$, and any $N \geq 1$, the
probability of the events

[F{m{X){f)) - Fi^lif))] 
^^ ll^'^/lli [3/2 + 6o^(x)] 



2N 



^J x\\VF{^^{fml \ 



is greater than $1 - e^{-x}$, for any $x \geq 0$. In the above display, $D_\mu(F)(f)$ stands
for the first order function defined in (72).

4.2.2 Empirical processes 

The objective of this section is to extend the quantitative concentration theorems,
theorem 4.1 and theorem 4.2, to the level of the empirical process associated
with a class of functions $\mathcal{F}$. These processes are given by the mapping

$$f \in \mathcal{F} \mapsto V(X)(f) = \sqrt{N}\,\left(m(X)(f) - \mu(f)\right)$$

Our main result in this direction is the following theorem, whose proof is
postponed to a later section.

Theorem 4.3 For any class of functions $\mathcal{F}$, with $I(\mathcal{F}) < \infty$, we have

$$\pi_\psi\left(\|V(X)\|_{\mathcal{F}}\right) \leq 12^2\int_0^2\sqrt{\log\left(8 + \mathcal{N}(\mathcal{F},\epsilon)^2\right)}\;d\epsilon$$

Remark 4.2 Using the fact that $\log(8 + x^2) \leq 4\log x$, for any $x \geq 2$, we obtain
the rather crude estimate

$$\int_0^2\sqrt{\log\left(8 + \mathcal{N}(\mathcal{F},\epsilon)^2\right)}\;d\epsilon \leq 2\int_0^2\sqrt{\log\mathcal{N}(\mathcal{F},\epsilon)}\;d\epsilon$$

We check the first observation, using the fact that $\theta(x) = 4\log x - \log(8 + x^2)$
is a non decreasing function on $\mathbb{R}_+$, and $\theta(2) = \log(4\times 4) - \log(4\times 3) \geq 0$.

Various examples of classes of functions with finite covering numbers and entropy
integrals are given in the book of Van der Vaart and Wellner [HI] (see for instance p.
86, p. 129, p. 135, and exercise 4 on p. 150, and p. 155). The estimation
of the quantities introduced above often depends on several deep results on
combinatorics that are not discussed here.

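To illustrate these mathematical objects, we mention that, for the set of
indicator functions

$$\mathcal{F} = \left\{1_{\prod_{i=1}^d(-\infty,x_i]}\;;\;(x_i)_{1\leq i\leq d} \in \mathbb{R}^d\right\} \qquad (74)$$

of cells in $E = \mathbb{R}^d$, we have

$$\mathcal{N}(\epsilon,\mathcal{F}) \leq c\,(d+1)(4e)^{d+1}\,\epsilon^{-2d}$$

for some universal constant $c < \infty$. This implies that

$$\sqrt{\log\mathcal{N}(\epsilon,\mathcal{F})} \leq \sqrt{\log\left[c\,(d+1)(4e)^{d+1}\right]} + \sqrt{2d}\,\sqrt{\log(1/\epsilon)}$$

An elementary calculation gives

$$\int_0^1\sqrt{\log(1/\epsilon)}\;d\epsilon \leq 2\int_0^\infty x^2e^{-x^2}\,dx = \sqrt{\pi}/2 \leq 1$$

from which we conclude that

$$\int_0^2\sqrt{\log\mathcal{N}(\mathcal{F},\epsilon)}\;d\epsilon \leq 2\sqrt{\log\left[c\,(d+1)(4e)^{d+1}\right]} + \sqrt{2d} \leq c'\sqrt{d} \qquad (75)$$

for some universal constant $c' < \infty$.

For $d = 1$, we also have that $\mathcal{N}(\epsilon,\mathcal{F}) \leq 2/\epsilon^2$ (cf. p. 129 in [HI]) and therefore

$$\int_0^2\sqrt{\log\mathcal{N}(\mathcal{F},\epsilon)}\;d\epsilon \leq 3\sqrt{2}$$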
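
The elementary integral used in this computation, $\int_0^1\sqrt{\log(1/\epsilon)}\,d\epsilon = \sqrt{\pi}/2$ (obtained with the change of variable $\epsilon = e^{-x^2}$), is easy to confirm numerically. The following sketch uses a plain midpoint rule, which never evaluates the integrand at the endpoints where its derivative blows up; the function name and the number of cells are arbitrary choices.

```python
import math


def entropy_integral(num_cells=100000):
    """Midpoint rule for the integral of sqrt(log(1/e)) over e in (0, 1)."""
    h = 1.0 / num_cells
    return h * sum(math.sqrt(-math.log((j + 0.5) * h)) for j in range(num_cells))


if __name__ == "__main__":
    print(entropy_integral(), math.sqrt(math.pi) / 2.0)
```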



Remark 4.3 In this chapter, we have assumed that the class of functions $\mathcal{F}$ is
such that $\sup_{f\in\mathcal{F}}\|f\| \leq 1$. When $\sup_{f\in\mathcal{F}}\|f\| \leq c_{\mathcal{F}}$, for some finite constant
$c_{\mathcal{F}}$, using theorem 4.3 it is also readily checked that

7r^(||y(X)||^)<122 / v'log(8+AA(^,e)2)& (76) 

Jo 

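We mention that the uniform entropy condition $I(\mathcal{F}) < \infty$ is required in
Glivenko-Cantelli and Donsker theorems for empirical processes associated with
non necessarily independent random sequences [43].

Arguing as above, we prove that the probability of the events

$$\|V(X)\|_{\mathcal{F}} \leq I_1(\mathcal{F})\,\sqrt{x + \log 2}$$

is greater than $1 - e^{-x}$, for any $x \geq 0$, with some constant

$$I_1(\mathcal{F}) \leq 12^2\int_0^2\sqrt{\log\left(8 + \mathcal{N}(\mathcal{F},\epsilon)^2\right)}\;d\epsilon$$

As for marginal models (71), our next objective is to extend theorem 4.2 to
empirical processes associated with some classes of functions. Here, we consider
the empirical processes

$$f \in \mathcal{F} \mapsto m(X)(f) \in \mathbb{R}^d$$

associated with $d$ classes of functions $\mathcal{F}_i$, $1 \leq i \leq d$, defined in section 4.1. We
further assume that $\|f_i\| \vee \mathrm{osc}(f_i) \leq 1$, for any $f_i \in \mathcal{F}_i$, and we set

$$\mathcal{F} := \prod_{1\leq i\leq d}\mathcal{F}_i \quad\text{and}\quad \pi_\psi\left(\|V(X)\|_{\mathcal{F}}\right) := \sup_{1\leq i\leq d}\pi_\psi\left(\|V(X)\|_{\mathcal{F}_i}\right)$$

Using theorem 4.3 we mention that

$$\pi_\psi\left(\|V(X)\|_{\mathcal{F}}\right) \leq 12^2\int_0^2\sqrt{\log\left(8 + \mathcal{N}(\mathcal{F},\epsilon)^2\right)}\;d\epsilon$$

with

$$\mathcal{N}(\mathcal{F},\epsilon) := \sup_{1\leq i\leq d}\mathcal{N}(\mathcal{F}_i,\epsilon)$$

We set

$$\|\nabla F_\mu\|_\infty := \sup\left|\frac{\partial F}{\partial u^i}(\mu(f))\right| \quad\text{and}\quad \|\nabla^2F\|_\infty := \sup\left|\frac{\partial^2F}{\partial u^i\partial u^j}(\nu(f))\right|$$

The supremum in the l.h.s. is taken over all $1 \leq i \leq d$ and all $f \in \mathcal{F}$; and the
supremum in the r.h.s. is taken over all $1 \leq i,j \leq d$, $\nu \in \mathcal{P}(E)$, and all $f \in \mathcal{F}$.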
We are now in position to state the final main result of this section. The
proof of the next theorem is housed at the end of section 4.7.

Theorem 4.4 Let $F$ be a second order smooth function on $\mathbb{R}^d$, for some $d \geq 1$.
For any classes of functions $\mathcal{F}_i$, $1 \leq i \leq d$, and for any $x \geq 0$, the probability of
the following events

sup^g^|F(m(X)(/))-F(M/))| 
<-^7r^(||y(X)i|^) i|VF^|L(l + 2Vi) 

+^ II v'^L (d M\\vix)y)f (i + eo 1 (I)) 

is greater than $1 - e^{-x}$.

4.3 A reminder on Orlicz' norms 

In this section, we have collected some important properties of Orlicz' norms.
The first section, section 4.3.1, is concerned with rather elementary comparison
properties. In section 4.3.2 we present a natural way to obtain Laplace estimates,
and related concentration inequalities, using simple Orlicz' norm upper
bounds.

4.3.1 Comparison properties

This short section is mainly concerned with the proof of the following three
comparison properties.

Lemma 4.1 For any non negative variables $(Y_1,Y_2)$ we have

$$Y_1 \leq Y_2 \implies \pi_\psi(Y_1) \leq \pi_\psi(Y_2)$$

as well as

$$\left(\forall m \geq 0 \quad \mathbb{E}\left(Y_1^{2m}\right) \leq \mathbb{E}\left(Y_2^{2m}\right)\right) \implies \pi_\psi(Y_1) \leq \pi_\psi(Y_2) \qquad (77)$$

In addition, for any pair of independent random variables $(X,Y)$ on some
measurable state space, and any measurable function $f$, we have

$$\left(\pi_\psi(f(x,Y)) \leq c \ \text{ for } \mathbb{P}\text{-a.e. } x\right) \implies \pi_\psi(f(X,Y)) \leq c \qquad (78)$$

Proof:

The first assertion is immediate, and the second assertion comes from the fact
that, for any $a > 0$,

$$\mathbb{E}\left(\psi(Y/a)\right) = \sum_{m\geq 1}\frac{1}{m!}\,\frac{\mathbb{E}\left(Y^{2m}\right)}{a^{2m}}$$

The last assertion comes from the fact that

$$\mathbb{E}\left(\mathbb{E}\left(\psi(f(X,Y)/c) \mid X\right)\right) \leq 1 \implies \pi_\psi(f(X,Y)) \leq c$$

This ends the proof of the lemma. ■



4.3.2 Concentration properties 

The following lemma provides a simple way to transfer a control on Orlicz'
norm into moment or Laplace estimates, which in turn can be used to derive
quantitative concentration inequalities.

Lemma 4.2 For any non negative random variable $Y$, and any integer $m \geq 0$,
we have

$$\mathbb{E}\left(Y^{2m}\right) \leq m!\;\pi_\psi(Y)^{2m} \quad\text{and}\quad \mathbb{E}\left(Y^{2m+1}\right) \leq (m+1)!\;\pi_\psi(Y)^{2m+1} \qquad (79)$$

In addition, for any $t \geq 0$ we have the Laplace estimates

$$\mathbb{E}\left(e^{tY}\right) \leq \min\left(2\,e^{\frac{1}{4}(t\pi_\psi(Y))^2}\ ,\ (1 + t\pi_\psi(Y))\,e^{(t\pi_\psi(Y))^2}\right)$$

In particular, for any $x \geq 0$ the probability of the event

$$Y \leq \pi_\psi(Y)\,\sqrt{x + \log 2} \qquad (80)$$

is greater than $1 - e^{-x}$.

Remark 4.4 For a Gaussian and centred random variable $Y$, s.t. $\mathbb{E}(Y^2) = 1$,
we recall that $\pi_\psi(Y) = \sqrt{8/3}$. In this situation, letting $y = \sqrt{8(x + \log 2)/3}$ in
(80), we find that

$$\mathbb{P}(|Y| \geq y) \leq 2\,e^{-\frac{3}{4}\,\frac{y^2}{2}}$$

Working directly with the Laplace Gaussian function $\mathbb{E}\left(e^{tY}\right) = e^{t^2/2}$, we
remove the factor 3/4. In this sense, we lose a factor 3/4 using the Orlicz
concentration property (80).

In this situation, the l.h.s. moment estimate in (79) takes the form

$$b(2m)^{2m} = \frac{(2m)!}{m!}\,2^{-m} \leq m!\;(8/3)^m \qquad (81)$$

while using Stirling's approximation of the factorials we obtain the estimate

$$\frac{(2m)!}{m!^2}\,2^{-m} \simeq \frac{2^m}{\sqrt{\pi m}} \quad \left(\leq (8/3)^m\right)$$
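
The value $\sqrt{8/3}$ can be recovered numerically. The sketch below assumes the usual definition of the Orlicz norm, $\pi_\psi(Y) = \inf\{a > 0 : \mathbb{E}(\psi(|Y|/a)) \leq 1\}$ with $\psi(u) = e^{u^2} - 1$ (this definition is not restated in this section), together with the closed form $\mathbb{E}(e^{Y^2/a^2}) = (1 - 2/a^2)^{-1/2}$ for a standard Gaussian variable and $a^2 > 2$. It also compares the exact Gaussian tail with the Orlicz bound $2e^{-3y^2/8}$ and with the sharper bound $2e^{-y^2/2}$. Function names are illustrative.

```python
import math


def gaussian_psi_moment(a):
    """E(psi(|Y|/a)) = E(exp(Y^2/a^2)) - 1 for Y ~ N(0, 1); requires a^2 > 2."""
    return 1.0 / math.sqrt(1.0 - 2.0 / (a * a)) - 1.0


def orlicz_norm_gaussian(tol=1e-12):
    """Smallest a with E(psi(|Y|/a)) <= 1, found by bisection on (sqrt(2), 10]."""
    lo, hi = math.sqrt(2.0) + 1e-9, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_psi_moment(mid) > 1.0:
            lo = mid     # a too small: the psi-moment exceeds one
        else:
            hi = mid
    return hi


def gaussian_two_sided_tail(y):
    """P(|Y| >= y) for a standard Gaussian variable."""
    return math.erfc(y / math.sqrt(2.0))


if __name__ == "__main__":
    print(orlicz_norm_gaussian(), math.sqrt(8.0 / 3.0))
    for y in (0.5, 1.0, 2.0, 3.0):
        print(y, gaussian_two_sided_tail(y),
              2.0 * math.exp(-3.0 * y * y / 8.0), 2.0 * math.exp(-y * y / 2.0))
```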



Remark 4.5 Given a sequence of independent Gaussian and centred random
variables $Y_i$, s.t. $\mathbb{E}(Y_i^2) = 1$, for $i \geq 1$, and any sequence of non negative
numbers $a_i$, we have

$$\pi_\psi\left(\sum_{i=1}^n a_iY_i\right) = \sqrt{8/3}\,\sqrt{\sum_{i=1}^n a_i^2} := \sqrt{8/3}\;\|a\|_2$$

while

$$\sum_{i=1}^n a_i\,\pi_\psi(Y_i) = \sqrt{8/3}\,\sum_{i=1}^n a_i := \sqrt{8/3}\;\|a\|_1$$

Notice that

$$\|a\|_2 \leq \|a\|_1 \leq \sqrt{n}\;\|a\|_2$$

When the coefficients $a_i$ are almost equal, we can lose a factor $\sqrt{n}$ using the
triangle inequality, instead of estimating directly the Orlicz norm of the
Gaussian mixture. In this sense, it is always preferable to avoid the use of the
triangle inequality, and to estimate directly the Orlicz norms of linear
combinations of "almost Gaussian" random variables.

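Now, we come to the proof of the lemma.
Proof of lemma 4.2:

For any $m \geq 1$, we have

$$x^{2m} \leq m!\sum_{n\geq 1}\frac{x^{2n}}{n!} = m!\;\psi(x) \implies \mathbb{E}\left(\left(\frac{Y}{\pi_\psi(Y)}\right)^{2m}\right) \leq m!\;\mathbb{E}\left(\psi\left(\frac{Y}{\pi_\psi(Y)}\right)\right) \leq m!$$

For odd integers, we simply use the Cauchy-Schwarz inequality to check that

$$\mathbb{E}\left(Y^{2m+1}\right)^2 \leq \mathbb{E}\left(Y^{2m}\right)\mathbb{E}\left(Y^{2(m+1)}\right) \leq (m+1)!^2\;\pi_\psi(Y)^{2(2m+1)}$$

for any non negative random variable $Y$, so that

$$\mathbb{E}\left(Y^{2m+1}\right) \leq (m+1)!\;\pi_\psi(Y)^{2m+1}$$

This ends the proof of the first assertion.

Recalling that $(2m)! \geq m!^2$ and $(m+1) \leq (2m+1)$, we find that

$$\mathbb{E}\left(e^{tY}\right) = \sum_{m\geq 0}\frac{t^{2m}}{(2m)!}\,\mathbb{E}\left(Y^{2m}\right) + \sum_{m\geq 0}\frac{t^{2m+1}}{(2m+1)!}\,\mathbb{E}\left(Y^{2m+1}\right)$$
$$\leq \sum_{m\geq 0}\frac{t^{2m}}{m!}\,\pi_\psi(Y)^{2m} + \sum_{m\geq 0}\frac{t^{2m+1}}{m!}\,\pi_\psi(Y)^{2m+1} = (1 + t\pi_\psi(Y))\,\exp\left((t\pi_\psi(Y))^2\right)$$

On the other hand, using the estimate

$$tY = \left(\frac{t\pi_\psi(Y)}{\sqrt{2}}\right)\left(\frac{\sqrt{2}\,Y}{\pi_\psi(Y)}\right) \leq \frac{(t\pi_\psi(Y))^2}{4} + \left(\frac{Y}{\pi_\psi(Y)}\right)^2$$

we prove that

$$\mathbb{E}\left(e^{tY}\right) \leq 2\exp\left(\frac{(t\pi_\psi(Y))^2}{4}\right)$$

The proof of the Laplace estimates is now completed. To prove the
last assertion, we use the fact that for any $y \geq 0$

$$\mathbb{P}(Y \geq y) \leq 2\exp\left(-\sup_{t\geq 0}\left(ty - (t\pi_\psi(Y))^2/4\right)\right) = 2\exp\left(-\left(y/\pi_\psi(Y)\right)^2\right)$$

This implies that

$$\mathbb{P}\left(Y \geq \pi_\psi(Y)\sqrt{x + \log 2}\right) \leq 2\exp\left(-(x + \log 2)\right) = e^{-x}$$

This ends the proof of the lemma. ■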
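
For completeness, here is the elementary optimization over $t$ behind the tail estimate, written with the shorthand $\pi := \pi_\psi(Y)$ (a notation introduced here only for brevity). The quadratic $t \mapsto ty - t^2\pi^2/4$ is maximal at $t = 2y/\pi^2$:

```latex
\sup_{t\ge 0}\Big(ty-\tfrac14\,t^2\pi^2\Big)
  = \Big(ty-\tfrac14\,t^2\pi^2\Big)\Big|_{t=2y/\pi^2}
  = \frac{2y^2}{\pi^2}-\frac{y^2}{\pi^2}
  = \frac{y^2}{\pi^2},
\qquad\text{so that}\qquad
\mathbb{P}(Y\ge y)\;\le\; 2\exp\!\big(-(y/\pi)^2\big).
```

Choosing $y = \pi\sqrt{x+\log 2}$ then gives the bound $2e^{-(x+\log 2)} = e^{-x}$.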

4.3.3 Maximal inequalities 

Let us now put together the Orlicz norm properties derived above to
establish a series of more or less well known maximal inequalities. More general
results can be found in the books [751 IHI] ; or in the lecture notes [SB] . 

We emphasize that in the literature on empirical processes, maximal inequalities
are often presented in terms of universal constants $c$ without further
information on their magnitude. In the present section, we shall try to estimate
some of these universal constants explicitly.

To begin with, we consider a couple of maximal inequalities over finite sets.

Lemma 4.3 For any finite collection of non negative random variables $(Y_i)_{i\in I}$,
and any collection of non negative numbers $(a_i)_{i\in I}$, we have

$$\sup_{i\in I}\mathbb{E}\left(\psi(Y_i/a_i)\right) \leq 1 \implies \mathbb{E}\left(\max_{i\in I}Y_i\right) \leq \psi^{-1}(|I|)\times\max_{i\in I}a_i$$

Proof:

We check this claim using the following estimates

$$\psi\left(\frac{\mathbb{E}\left(\max_{i\in I}Y_i\right)}{\max_{i\in I}a_i}\right) \leq \psi\left(\mathbb{E}\left(\max_{i\in I}(Y_i/a_i)\right)\right) \leq \mathbb{E}\left(\psi\left(\max_{i\in I}(Y_i/a_i)\right)\right) \leq \mathbb{E}\left(\sum_{i\in I}\psi(Y_i/a_i)\right) \leq |I|$$

This ends the proof of the lemma. ■

Working a little harder, we prove the following lemma.

Lemma 4.4 For any finite collection of non negative random variables $(Y_i)_{i\in I}$,
we have

$$\pi_\psi\left(\max_{i\in I}Y_i\right) \leq \sqrt{6\log(8 + |I|)}\;\max_{i\in I}\pi_\psi(Y_i)$$

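Proof:

Without loss of generality, we assume that $\max_{i\in I}\pi_\psi(Y_i) \leq 1$, and $I = \{1,\ldots,|I|\}$.
In this situation, it suffices to check that

$$\mathbb{E}\left(\psi\left(\frac{\max_{1\leq i\leq |I|}Y_i}{\sqrt{6\log(8+|I|)}}\right)\right) \leq \mathbb{E}\left(\psi\left(\max_{1\leq i\leq |I|}\frac{Y_i}{\sqrt{6\log(8+i)}}\right)\right) \leq 1$$

Firstly, we notice that for any $i \geq 1$ and $x \geq 3/2$ we have

$$\frac{1}{\log(8+i)} + \frac{1}{\log x} \leq \frac{1}{\log 9} + \frac{1}{\log(3/2)} \leq 3$$

and therefore

$$3\log(8+i)\,\log(x) \geq \log\left(x(8+i)\right)$$

The first estimate is checked numerically: $1/\log 9 + 1/\log(3/2) \approx 0.455 + 2.466 = 2.921 \leq 3$.
Using these observations, and the fact that $\mathbb{P}(Y_i > y) \leq 2e^{-y^2}$ as soon as $\pi_\psi(Y_i) \leq 1$, we have

$$\mathbb{P}\left(\max_{1\leq i\leq |I|}\frac{Y_i^2}{6\log(8+i)} > \log x\right) \leq \sum_{1\leq i\leq |I|}\mathbb{P}\left(Y_i > \sqrt{2\log\left(x(8+i)\right)}\right)$$

This implies that

$$\mathbb{P}\left(\max_{1\leq i\leq |I|}\frac{Y_i^2}{6\log(8+i)} > \log x\right) \leq \frac{2}{x^2}\sum_{i\geq 1}\frac{1}{(8+i)^2} \leq \frac{2}{x^2}\int_8^\infty\frac{1}{u^2}\,du = \frac{1}{(2x)^2}$$

If we set

$$Z_I := \exp\left\{\left(\max_{1\leq i\leq |I|}\frac{Y_i}{\sqrt{6\log(8+i)}}\right)^2\right\}$$

then we have

$$\mathbb{E}(Z_I) = \int_0^\infty\mathbb{P}(Z_I > x)\,dx \leq \frac{3}{2} + \int_{3/2}^\infty\frac{1}{(2x)^2}\,dx = \frac{3}{2}\left(1 + \frac{1}{9}\right) = \frac{15}{9} \leq 2$$

and therefore

$$\mathbb{E}\left(\psi\left(\max_{1\leq i\leq |I|}\frac{Y_i}{\sqrt{6\log(8+i)}}\right)\right) \leq 1$$

This ends the proof of the lemma. ■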
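
Two elementary facts carry the numerical constants of this proof: the inequality $3\log(8+i)\log x \geq \log(x(8+i))$ for $i \geq 1$ and $x \geq 3/2$, and the series bound $\sum_{i\geq 1}(8+i)^{-2} \leq 1/8$. Both can be sanity-checked on a grid with the short sketch below; the function names and the grid are arbitrary, and a grid check is of course an illustration rather than a proof.

```python
import math


def log_inequality_gap(i, x):
    """3 log(8 + i) log(x) - log(x (8 + i)); expected to be non negative for i >= 1, x >= 3/2."""
    return 3.0 * math.log(8 + i) * math.log(x) - math.log(x * (8 + i))


def tail_series(n_terms):
    """Partial sum of 1 / (8 + i)^2 over 1 <= i <= n_terms."""
    return sum(1.0 / (8 + i) ** 2 for i in range(1, n_terms + 1))


if __name__ == "__main__":
    worst = min(log_inequality_gap(i, x)
                for i in range(1, 200) for x in (1.5, 2.0, 5.0, 50.0, 1e6))
    print(worst, tail_series(100000), 1.0 / 8.0)
```

The smallest gap is attained at the corner $i = 1$, $x = 3/2$, which is exactly the case singled out in the proof.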

The following technical lemma is pivotal in the analysis of maximal inequal- 
ities for sequences of random variables indexed by infinite but separable subsets 
equipped with a pseudo-metric, under some Lipschitz regularity conditions w.r.t. 
the Orlicz's norm. 

Lemma 4.5 We assume that the index set $(I,d)$ is a separable and totally 
bounded pseudo-metric space, with finite diameter 

$$d(I):=\sup_{(i,j)\in I^2}d(i,j)<\infty$$

We let $(Y_i)_{i\in I}$ be a separable and $\mathbb R$-valued stochastic process indexed by $I$ 
and such that 

$$\pi_\psi(Y_i-Y_j)\le c\,d(i,j)$$

for some finite constant $c<\infty$. We also assume that $Y_{i_0}=0$, for some $i_0\in I$. 
Then, we have 

$$\pi_\psi\Big(\sup_{i\in I}Y_i\Big)\le 12\,c\int_0^{d(I)}\sqrt{6\log\big(8+\mathcal N(I,d,\epsilon)^2\big)}\ d\epsilon$$

Proof: 

Replacing $Y_i$ by $Y_i/d(I)$, and $d$ by $d/d(I)$, there is no loss of generality to assume 
that $d(I)\le 1$. In the same way, replacing $Y_i$ by $Y_i/c$, we can also assume that 
$c\le 1$. For a given finite subset $J\subset I$, with $i_0\in J$, we let $J_k=\{i_1,\ldots,i_{n_k}\}\subset J$ 
be the centers of $n_k=\mathcal N(J,d,2^{-k})$ balls of radius at most $2^{-k}$ covering $J$. For 
$k=0$, we set $J_0=\{i_0\}$. We also consider the mapping $\theta_k: i\in J\mapsto\theta_k(i)\in J_k$ 
s.t. 

$$\sup_{i\in J}d(\theta_k(i),i)\le 2^{-k}$$

The set $J$ being finite, there exists some sufficiently large integer $k^\star_J$ s.t. $d(\theta_k(i),i)=0$, 
for any $k\ge k^\star_J$; and therefore $Y_i=Y_{\theta_k(i)}$, for any $i\in J$, and any $k\ge k^\star_J$. This 
implies that 

$$Y_i=\sum_{k=1}^{k^\star_J}\big[Y_{\theta_k(i)}-Y_{\theta_{k-1}(i)}\big]$$

We also notice that 

$$d(\theta_k(i),\theta_{k-1}(i))\le d(\theta_k(i),i)+d(i,\theta_{k-1}(i))\le 2^{-k}+2^{-(k-1)}=3\times 2^{-k}$$

and 

$$\sup_{(i,j)\in(J_k\times J_{k-1})\,:\,d(i,j)\le 3\times 2^{-k}}\pi_\psi(Y_i-Y_j)\le 3\times 2^{-k}$$

Using lemma 4.4 we prove that 

$$\pi_\psi\Big(\sup_{i\in J}Y_i\Big)\le\sum_{k=1}^{k^\star_J}\pi_\psi\Big(\sup_{i\in J}\big|Y_{\theta_k(i)}-Y_{\theta_{k-1}(i)}\big|\Big)\le 3\sum_{k=1}^{k^\star_J}\sqrt{6\log\big(8+\mathcal N(J,d,2^{-k})^2\big)}\ 2^{-k}$$
fc=i 


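On the other hand, we have 

$$2\,\big(2^{-k}-2^{-(k+1)}\big)=2^{-k}$$

and 

$$\sqrt{6\log\big(8+\mathcal N(J,d,2^{-k})^2\big)}\ 2^{-k}\le 2\int_{2^{-(k+1)}}^{2^{-k}}\sqrt{6\log\big(8+\mathcal N(J,d,\epsilon)^2\big)}\ d\epsilon$$

from which we conclude that 

$$\pi_\psi\Big(\sup_{i\in J}Y_i\Big)\le 6\int_0^{1/2}\sqrt{6\log\big(8+\mathcal N(J,d,\epsilon)^2\big)}\ d\epsilon$$

Using the fact that the $\epsilon$-balls with center in $I$ and intersecting $J$ are necessarily 
contained in a $(2\epsilon)$-ball with center in $J$, we also have 

$$\mathcal N(J,d,2\epsilon)\le\mathcal N(I,d,\epsilon)$$

This implies that 

$$\pi_\psi\Big(\sup_{i\in J}Y_i\Big)\le 12\int_0^{1/4}\sqrt{6\log\big(8+\mathcal N(I,d,\epsilon)^2\big)}\ d\epsilon$$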

The end of the proof is now a direct consequence of the monotone convergence 
theorem with increasing series of finite subsets exhausting /. This ends the 
proof of the lemma. ■ 
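
To give a feeling for the size of the entropy integral in lemma 4.5, here is a minimal numerical sketch. It assumes, purely for illustration, the index set $I=[0,1]$ with $d(i,j)=|i-j|$, for which the covering number by balls of radius $\epsilon$ is $\lceil 1/(2\epsilon)\rceil$; the function names and the midpoint quadrature are choices made for this example and do not come from the notes.

```python
import math

def covering_number_unit_interval(eps: float) -> int:
    """Number of closed balls of radius eps needed to cover [0, 1] (illustrative model)."""
    return max(1, math.ceil(1.0 / (2.0 * eps)))

def entropy_bound(c: float = 1.0, diameter: float = 1.0, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of 12 c int_0^{d(I)} sqrt(6 log(8 + N(eps)^2)) d eps."""
    h = diameter / steps
    total = 0.0
    for k in range(steps):
        eps = (k + 0.5) * h
        n_eps = covering_number_unit_interval(eps)
        total += math.sqrt(6.0 * math.log(8.0 + n_eps ** 2)) * h
    return 12.0 * c * total

bound = entropy_bound()
print(bound)
```

The integrand only grows like $\sqrt{\log(1/\epsilon)}$ near the origin, so the integral is finite even though the covering numbers blow up.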
4.4 Marginal inequalities 

This section is mainly concerned with the proof of theorem 4.1. This result 
is a more or less direct consequence of the following technical lemma of separate 
interest. 

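Lemma 4.6 Let $M_n:=\sum_{0\le p\le n}\Delta_p$ be a real valued martingale with symmetric 
and independent increments $(\Delta_n)_{n\ge 0}$. For any integer $m\ge 1$, and any $n\ge 0$, 
we have 

$$\mathbb E\big(|M_n|^m\big)^{1/m}\le b(m)\ \mathbb E\big([M]_n^{m'/2}\big)^{1/m'}\qquad(82)$$

with the smallest even integer $m'\ge m$, the bracket process 

$$[M]_n:=\sum_{0\le p\le n}\Delta_p^2$$

and the collection of constants $b(m)$ defined in (24). In addition, for any $m\ge 2$, 
we have 

$$\mathbb E\big(|M_n|^m\big)^{1/m}\le b(m)\,\sqrt{n+1}\ \Big(\frac{1}{n+1}\sum_{0\le p\le n}\mathbb E\big(|\Delta_p|^{m'}\big)\Big)^{1/m'}\qquad(83)$$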

Proof of theorem 4.1: We consider a collection of independent copies $X'=(X'^i)_{i\ge 1}$ 
of the random variables $X=(X^i)_{i\ge 1}$. We consider the martingale 
sequence $M=(M_j)_{1\le j\le N}$ with symmetric and independent increments defined 
for any $1\le j\le N$ by the following formula 

$$M_j:=\frac{1}{\sqrt N}\sum_{i=1}^{j}\big[f(X^i)-f(X'^i)\big]$$

By construction, we have 

$$V(X)(f)=\frac{1}{\sqrt N}\sum_{i=1}^{N}\big(f(X^i)-\mu^i(f)\big)=\mathbb E(M_N\,|\,X)$$

Combining this conditioning property with the estimates provided in lemma 4.6, 
the proof of the first assertion is now easily completed. 

The Orlicz norm estimate (71) comes from the fact that for any $f\in\mathrm{Osc}(E)$, 
we have 

$$\mathbb E\big(|V(X)(f)|^{2m}\big)\le b(2m)^{2m}=\mathbb E(U^{2m})$$

for a Gaussian and centred random variable $U$, s.t. $\mathbb E(U^2)=1$. Using the 
comparison lemma, we find that 

$$\pi_\psi\big(V(X)(f)\big)\le\pi_\psi(U)=\sqrt{8/3}$$

Applying Khintchine's inequalities (82), we prove that 

$$\mathbb E\big(|V(X)(f)|^m\big)^{1/m}\le b(m)\ \mathbb E\Big(\Big[\frac1N\sum_{i=1}^N\big[f(X^i)-f(X'^i)\big]^2\Big]^{m'/2}\Big)^{1/m'}$$

By construction, we notice that for any $f\in\mathrm{Osc}(E)$, and any $p\ge 2$, we have 

$$\frac1N\sum_{i=1}^N\mathbb E\big(\big[f(X^i)-f(X'^i)\big]^p\big)\le 2\sigma(f)^2$$



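By the Rosenthal type inequality stated in theorem 2.5 in [63], for any sequence 
of nonnegative, independent and bounded random variables $(Y_i)_{i\ge 1}$, we 
have the rough estimate 

$$\mathbb E\Big(\Big[\sum_{i=1}^N Y_i\Big]^p\Big)^{1/p}\le 2p\,\max\Big(\sum_{i=1}^N\mathbb E(Y_i),\ \Big[\sum_{i=1}^N\mathbb E(Y_i^p)\Big]^{1/p}\Big)$$

for any $p\ge 1$. If we take $p=m'/2$, and 

$$Y_i=\frac1N\big[f(X^i)-f(X'^i)\big]^2$$

we prove that 

$$\mathbb E\Big(\Big[\frac1N\sum_{i=1}^N\big[f(X^i)-f(X'^i)\big]^2\Big]^{m'/2}\Big)^{2/m'}\le 4m\,\max\Big(2\sigma(f)^2,\ \frac{1}{N^{1-2/m'}}\big[2\sigma(f)^2\big]^{2/m'}\Big)$$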

for any $f\in\mathrm{Osc}(E)$. Using Stirling's approximation of factorials 

$$\sqrt{2\pi n}\ n^n e^{-n}\le n!\le e\,\sqrt{2\pi n}\ n^n e^{-n}$$

for any $p\ge 1$ we have 

$$(2p)^p/b(2p)^{2p}=2^{2p}\,p^p\,p!/(2p)!\le e^{p+1}\le 3^{2p}$$

and 

$$(2p+1)^{p+1/2}/b(2p+1)^{2p+1}=(2p+1)^{p+1}\,2^p\,p!/(2p+1)!\le 3^{2p+1}$$

This implies that 

$$m^{m/2}/b(m)^m\le 3^m\ \Longrightarrow\ \sqrt m\ b(m)\le 3\,b(m)^2$$
for any m > 1. This ends the proof of the theorem. 
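
The constants $b(m)$ can be evaluated directly. The sketch below assumes the closed forms that can be read off from the identities used in this proof, namely $b(2p)^{2p}=(2p)!/(p!\,2^p)$ and $b(2p+1)^{2p+1}=\frac{(2p+1)!}{p!}\,2^{-(p+1/2)}/\sqrt{p+1/2}$; the function name `b` and the range of $m$ are choices made for the example only.

```python
import math

def b(m: int) -> float:
    """Constant b(m), using the closed forms assumed in the lead-in above."""
    p, odd = divmod(m, 2)
    if not odd:
        val = math.factorial(2 * p) / (math.factorial(p) * 2.0 ** p)
    else:
        val = (math.factorial(2 * p + 1) / math.factorial(p)
               * 2.0 ** (-(p + 0.5)) / math.sqrt(p + 0.5))
    return val ** (1.0 / m)

def stirling_ok(n: int) -> bool:
    """Two-sided Stirling estimate quoted in the text."""
    base = math.sqrt(2.0 * math.pi * n) * n ** n * math.exp(-n)
    return base <= math.factorial(n) <= math.e * base

# sqrt(m) b(m) <= 3 b(m)^2 is equivalent to sqrt(m) <= 3 b(m).
ratio_ok = all(math.sqrt(m) <= 3.0 * b(m) for m in range(1, 41))
print(ratio_ok, all(stirling_ok(n) for n in range(1, 21)))
```

For even indices, $b(2p)^{2p}$ is the Gaussian moment $\mathbb E(U^{2p})=(2p-1)!!$, which is the identity used later for the chi-square type estimates.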



Now, we come to the proof of the lemma. 
Proof of lemma 4.6: 

We prove the lemma by induction on the parameter $n$. The result is clearly 
satisfied for $n=0$. Suppose the estimate (82) is true at rank $(n-1)$. To prove 
the result at rank $n$, we use the binomial decomposition 

$$(M_{n-1}+\Delta_n)^{2m}=\sum_{p=0}^{2m}\binom{2m}{p}\,M_{n-1}^{2m-p}\,(\Delta_n)^p$$

Using the symmetry condition, all the odd moments of $\Delta_n$ are null. Consequently, 
we find that 

$$\mathbb E\big((M_{n-1}+\Delta_n)^{2m}\big)=\sum_{p=0}^{m}\binom{2m}{2p}\,\mathbb E\big(M_{n-1}^{2(m-p)}\big)\,\mathbb E\big(\Delta_n^{2p}\big)$$

Using the induction hypothesis, we prove that the above expression is upper 
bounded by the quantity 

$$\sum_{p=0}^{m}\binom{2m}{2p}\,2^{-(m-p)}\,(2(m-p))_{(m-p)}\ \mathbb E\big([M]_{n-1}^{m-p}\big)\,\mathbb E\big(\Delta_n^{2p}\big)$$

To take the final step, we use the fact that 

$$\binom{2m}{2p}\,2^{-(m-p)}\,(2(m-p))_{(m-p)}=\frac{2^{-m}\,(2m)_m}{2^{-p}\,(2p)_p}\binom{m}{p}$$

and $(2p)_p\ge 2^p$, to conclude that 

$$\mathbb E\big((M_{n-1}+\Delta_n)^{2m}\big)\le 2^{-m}\,(2m)_m\sum_{p=0}^{m}\binom{m}{p}\,\mathbb E\big([M]_{n-1}^{m-p}\big)\,\mathbb E\big(\Delta_n^{2p}\big)=2^{-m}\,(2m)_m\,\mathbb E\big([M]_n^m\big)$$
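For odd integers we use twice the Cauchy-Schwarz inequality to deduce that 

$$\mathbb E\big(|M_n|^{2m+1}\big)^2\le\mathbb E\big(M_n^{2m}\big)\,\mathbb E\big(M_n^{2(m+1)}\big)\le 2^{-(2m+1)}\,(2m)_m\,(2(m+1))_{(m+1)}\ \mathbb E\big([M]_n^{m+1}\big)^{2-\frac{1}{m+1}}$$

We conclude that 

$$\mathbb E\big(|M_n|^{2m+1}\big)\le 2^{-(m+1/2)}\,\frac{(2m+1)_{(m+1)}}{\sqrt{m+1/2}}\ \mathbb E\big([M]_n^{m+1}\big)^{1-\frac{1}{2(m+1)}}$$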

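The proof of (82) is now completed. Now, we come to the proof of (83). For 
any $m'\ge 2$ we have 

$$[M]_n^{m'/2}=(n+1)^{m'/2}\Big(\frac{1}{n+1}\sum_{0\le p\le n}\Delta_p^2\Big)^{m'/2}\le(n+1)^{m'/2}\ \frac{1}{n+1}\sum_{0\le p\le n}|\Delta_p|^{m'}$$

and therefore 

$$\mathbb E\big([M]_n^{m'/2}\big)^{1/m'}\le(n+1)^{1/2}\Big(\frac{1}{n+1}\sum_{0\le p\le n}\mathbb E\big(|\Delta_p|^{m'}\big)\Big)^{1/m'}$$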

This ends the proof of the lemma. 
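
The estimate (82) can be tested exactly on a small example. The sketch below takes increments $\Delta_p=c_p\,\epsilon_p$ with independent Rademacher signs $\epsilon_p$ (a hypothetical choice that satisfies the symmetry and independence assumptions, with a deterministic bracket $[M]_n=\sum_p c_p^2$) and enumerates all sign patterns. The closed forms used for $b(m)$ are the ones assumed from the identities in the proof of theorem 4.1.

```python
import itertools
import math

def b(m: int) -> float:
    """b(2p)^(2p) = (2p)!/(p! 2^p); b(2p+1)^(2p+1) = (2p+1)!/p! * 2^-(p+1/2) / sqrt(p+1/2)."""
    p, odd = divmod(m, 2)
    if not odd:
        val = math.factorial(2 * p) / (math.factorial(p) * 2.0 ** p)
    else:
        val = (math.factorial(2 * p + 1) / math.factorial(p)
               * 2.0 ** (-(p + 0.5)) / math.sqrt(p + 0.5))
    return val ** (1.0 / m)

def khintchine_sides(coeffs, m):
    """Exact left and right hand sides of (82) for M_n = sum_p c_p * eps_p."""
    n = len(coeffs)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        total += abs(sum(s * c for s, c in zip(signs, coeffs))) ** m
    lhs = (total / 2 ** n) ** (1.0 / m)
    m_even = m if m % 2 == 0 else m + 1        # smallest even integer >= m
    bracket = sum(c * c for c in coeffs)       # [M]_n, deterministic in this model
    rhs = b(m) * (bracket ** (m_even / 2.0)) ** (1.0 / m_even)
    return lhs, rhs

coeffs = [1.0, 0.5, 2.0, 1.5, 0.25, 1.0, 3.0, 0.75]   # arbitrary illustration values
for m in range(1, 7):
    print(m, khintchine_sides(coeffs, m))
```

For $m=2$ both sides coincide, since $\mathbb E(M_n^2)=\mathbb E([M]_n)$ and $b(2)=1$.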
4.5 Maximal inequalities 

The main goal of this section is to prove theorem 4.3. We begin with the 
basic symmetrization technique. We consider a collection of independent copies 
$X'=(X'^i)_{i\ge 1}$ of the random variables $X=(X^i)_{i\ge 1}$. Let $\epsilon=(\epsilon_i)_{i\ge 1}$ constitute 
a sequence that is independent and identically distributed with 

$$\mathbb P(\epsilon_i=+1)=\mathbb P(\epsilon_i=-1)=1/2$$

We also consider the empirical random field sequences 

$$V_\epsilon(X):=\sqrt N\ m_\epsilon(X)$$

We also assume that $(\epsilon,X,X')$ are independent. We associate with the 
pairs $(\epsilon,X)$ and $(\epsilon,X')$ the random measures $m_\epsilon(X)=\frac1N\sum_{i=1}^N\epsilon_i\,\delta_{X^i}$ and 
$m_\epsilon(X')=\frac1N\sum_{i=1}^N\epsilon_i\,\delta_{X'^i}$. 

We notice that 

$$\|m(X)-\mu\|_{\mathcal F}^p=\sup_{f\in\mathcal F}\big|m(X)(f)-\mathbb E\big(m(X')(f)\big)\big|^p\le\mathbb E\big(\|m(X)-m(X')\|_{\mathcal F}^p\ \big|\ X\big)$$

and in view of the symmetry of the random variables $(f(X^i)-f(X'^i))_{i\ge 1}$ we 
have 

$$\mathbb E\big(\|m(X)-m(X')\|_{\mathcal F}^p\big)=\mathbb E\big(\|m_\epsilon(X)-m_\epsilon(X')\|_{\mathcal F}^p\big)$$

from which we conclude that 

$$\mathbb E\big(\|V(X)\|_{\mathcal F}^p\big)\le 2^p\ \mathbb E\big(\|V_\epsilon(X)\|_{\mathcal F}^p\big)\qquad(84)$$

By using the Chernov-Hoeffding inequality for any $x^1,\ldots,x^N\in E$, the empirical 
process 

$$f\mapsto V_\epsilon(x)(f):=\sqrt N\ m_\epsilon(x)(f)$$

is sub-Gaussian for the norm $\|f\|_{L_2(m(x))}=m(x)(f^2)^{1/2}$. Namely, for any 
couple of functions $f,g$ and any $\delta>0$ we have 

$$\mathbb E\big(\big[V_\epsilon(x)(f)-V_\epsilon(x)(g)\big]^2\big)=\|f-g\|^2_{L_2(m(x))}$$

and by Hoeffding's inequality 

$$\mathbb P\big(|V_\epsilon(x)(f)-V_\epsilon(x)(g)|\ge\delta\big)\le 2\,e^{-\delta^2/\big(2\|f-g\|^2_{L_2(m(x))}\big)}$$

If we set $Z=\Big(\frac{V_\epsilon(x)(f)}{\sqrt 6\,\|f\|_{L_2(m(x))}}\Big)^2$, then we find that 

$$\mathbb E\big(e^Z\big)-1=\int_0^\infty e^t\ \mathbb P(Z\ge t)\,dt=\int_0^\infty e^t\ \mathbb P\big(|V_\epsilon(x)(f)|\ge\sqrt{6t}\,\|f\|_{L_2(m(x))}\big)\,dt\le 2\int_0^\infty e^t\,e^{-3t}\,dt=1$$

from which we conclude that 

$$\pi_\psi\big(V_\epsilon(x)(f)-V_\epsilon(x)(g)\big)\le\sqrt 6\ \|f-g\|_{L_2(m(x))}$$

Combining the maximal inequalities stated in lemma H31 and the condition- 
ing property ([78)l we find that 



with 

J{I') < 2 6^ /" 0og(8+A/'(J',e)2) de < c I{F) < oo 

for some finite universal constant c < oo. Combining ([M)l with (|77)) . this implies 
that 

This ends the proof of the theorem. ■ 



4.6 Cramer-Chernov inequalities 

4.6.1 Some preliminary convex analysis 

In this section, we present some basic Cramer-Chernov tools to derive quantitative 
concentration inequalities. We begin by recalling some preliminary convex 
analysis on Legendre-Fenchel transforms. We associate with any convex function 

$$L:\ t\in\mathrm{Dom}(L)\mapsto L(t)\in\mathbb R_+$$

defined in some domain $\mathrm{Dom}(L)\subset\mathbb R_+$, with $L(0)=0$, the Legendre-Fenchel 
transform $L^\star$ defined by the variational formula 

$$\forall\lambda\ge 0\qquad L^\star(\lambda):=\sup_{t\in\mathrm{Dom}(L)}\big(\lambda t-L(t)\big)$$

Note that $L^\star$ is a convex increasing function with $L^\star(0)=0$ and its inverse 
$(L^\star)^{-1}$ is a concave increasing function. 

We let $L_A$ be the log-Laplace transform of a random variable $A$ defined on 
some domain $\mathrm{Dom}(L_A)\subset\mathbb R_+$ by the formula 

$$L_A(t):=\log\mathbb E\big(e^{tA}\big)$$

Hölder's inequality implies that $L_A$ is convex. Using the Cramer-Chernov-Chebychev 
inequality, we find that 

$$\log\mathbb P(A\ge\lambda)\le-L_A^\star(\lambda)\quad\text{and}\quad\mathbb P\big(A\ge(L_A^\star)^{-1}(x)\big)\le e^{-x}$$

for any $\lambda>0$ and any $x\ge 0$. 
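
As a quick illustration of these two inequalities, consider a standard Gaussian variable $A$ (an example chosen here, not taken from the notes). Then $L_A(t)=t^2/2$, $L_A^\star(\lambda)=\lambda^2/2$ and $(L_A^\star)^{-1}(x)=\sqrt{2x}$, and the exact tail is available through the complementary error function.

```python
import math

def gaussian_tail(lam: float) -> float:
    """Exact P(A >= lam) for a standard Gaussian A."""
    return 0.5 * math.erfc(lam / math.sqrt(2.0))

def legendre_gaussian(lam: float) -> float:
    """L_A*(lam) = sup_{t >= 0} (lam t - t^2/2) = lam^2 / 2 for lam >= 0."""
    return 0.5 * lam * lam

def inverse_legendre_gaussian(x: float) -> float:
    """(L_A*)^{-1}(x) = sqrt(2 x)."""
    return math.sqrt(2.0 * x)

# Triples (x, exact tail at the level (L_A*)^{-1}(x), Chernov bound e^{-x}).
checks = [
    (x, gaussian_tail(inverse_legendre_gaussian(x)), math.exp(-x))
    for x in (0.1, 0.5, 1.0, 2.0, 5.0, 10.0)
]
for row in checks:
    print(row)
```

In each row the exact tail sits below $e^{-x}$, as the Cramer-Chernov-Chebychev inequality predicts.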

The next lemma provides some key properties of Legendre-Fenchel trans- 
forms that will be used in several places in the further development of the 
lecture notes. 
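Lemma 4.7 • For any convex functions $(L_1,L_2)$, such that 

$$\forall t\in\mathrm{Dom}(L_2)\qquad L_1(t)\le L_2(t)\quad\text{and}\quad\mathrm{Dom}(L_2)\subset\mathrm{Dom}(L_1)$$

we have 

$$L_2^\star\le L_1^\star\quad\text{and}\quad(L_1^\star)^{-1}\le(L_2^\star)^{-1}$$

• if we have 

$$\forall t\in v^{-1}\mathrm{Dom}(L_2)=\mathrm{Dom}(L_1)\qquad L_1(t)=u\,L_2(v\,t)$$

for some positive numbers $(u,v)\in\mathbb R_+^2$, then we have 

$$L_1^\star(\lambda)=u\,L_2^\star\Big(\frac{\lambda}{uv}\Big)\quad\text{and}\quad(L_1^\star)^{-1}(x)=uv\,(L_2^\star)^{-1}\Big(\frac xu\Big)$$

for any $\lambda\ge 0$, and any $x\ge 0$. 

• Let $A$ be a random variable with a finite log-Laplace transform. For any 
$a\in\mathbb R$, we have 

$$L_A(t)=-at+L_{A+a}(t)$$

as well as 

$$L_A^\star(\lambda)=L_{A+a}^\star(\lambda+a)\quad\text{and}\quad(L_A^\star)^{-1}(x)=-a+(L_{A+a}^\star)^{-1}(x)$$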

We illustrate this technical lemma with the detailed analysis of three convex 
increasing functions of current use in the further development of these notes 

• $L(t)=t^2/(1-t)$, $t\in[0,1[$ 

• $L_0(t):=-t-\frac12\log(1-2t)$, $t\in[0,1/2[$. 

• $L_1(t):=e^t-1-t$ 

In the first situation, we readily check that 

$$L'(t)=\frac{1}{(1-t)^2}-1\quad\text{and}\quad L''(t)=\frac{2}{(1-t)^3}$$

An elementary manipulation yields that 

$$L^\star(\lambda)=\big(\sqrt{\lambda+1}-1\big)^2$$

and 

$$(L^\star)^{-1}(x)=\big(1+\sqrt x\big)^2-1=x+2\sqrt x$$

In the second situation, we have 

$$L_0'(t)=\frac{1}{1-2t}-1\quad\text{and}\quad L_0''(t)=\frac{2}{(1-2t)^2}$$

from which we find that 

$$L_0^\star(\lambda)=\frac12\big(\lambda-\log(1+\lambda)\big)$$



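We also notice that 

$$L_0(t)=t^2\sum_{p\ge 0}\frac{2}{2+p}\,(2t)^p\le\overline L_0(t):=\frac{t^2}{1-2t}=\frac14\,L(2t)$$

for every $t\in[0,1/2[$. Using lemma 4.7 we prove that 

$$\overline L_0^\star(\lambda)=\frac14\,L^\star(2\lambda)\le L_0^\star(\lambda)$$

$$(L_0^\star)^{-1}(x)\le(\overline L_0^\star)^{-1}(x)=\frac12\,(L^\star)^{-1}(4x)=2\big(x+\sqrt x\big)\qquad(85)$$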

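In the third situation, we have 

$$L_1'(t)=e^t-1\quad\text{and}\quad L_1''(t)=e^t$$

from which we conclude that 

$$L_1^\star(\lambda)=(1+\lambda)\log(1+\lambda)-\lambda$$

On the other hand, using the fact that $2\times 3^p\le(p+2)!$, for any $p\ge 0$, we prove 
that we have 

$$L_1(t)=\frac{t^2}{2}\sum_{p\ge 0}\frac{2\times 3^p}{(p+2)!}\Big(\frac t3\Big)^p\le\overline L_1(t):=\frac{t^2}{2(1-t/3)}$$

for every $t\in[0,3[$. This implies that 

$$\overline L_1^\star(\lambda)=\frac92\,L^\star\Big(\frac{2\lambda}{3}\Big)\le L_1^\star(\lambda)$$

and therefore 

$$(L_1^\star)^{-1}(x)\le(\overline L_1^\star)^{-1}(x)=\frac32\,(L^\star)^{-1}\Big(\frac{2x}{9}\Big)=\frac x3+\sqrt{2x}\qquad(86)$$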
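
The closed forms of the three Legendre-Fenchel transforms, and the two inverse bounds (85) and (86), can be cross-checked numerically. The sketch below is only an illustration: the ternary search, the bisection routine, the truncation of the domain of $L_1$ at $t=20$ and the test values of $\lambda$ and $x$ are all choices made for this example.

```python
import math

def legendre(L, t_max, lam, iters=200):
    """sup over 0 <= t < t_max of (lam*t - L(t)), by ternary search (concave objective)."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        c = hi - (hi - lo) / 3.0
        if lam * a - L(a) < lam * c - L(c):
            lo = a
        else:
            hi = c
    t = 0.5 * (lo + hi)
    return lam * t - L(t)

def L(t):  return t * t / (1.0 - t)
def L0(t): return -t - 0.5 * math.log(1.0 - 2.0 * t)
def L1(t): return math.exp(t) - 1.0 - t

# Closed forms stated in the text.
def L_star(lam):  return (math.sqrt(lam + 1.0) - 1.0) ** 2
def L0_star(lam): return 0.5 * (lam - math.log(1.0 + lam))
def L1_star(lam): return (1.0 + lam) * math.log(1.0 + lam) - lam

def inverse(g, x):
    """Solve g(lam) = x for an increasing g with g(0) = 0, by bisection."""
    lo, hi = 0.0, 1.0
    while g(hi) < x:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < x:
            lo = mid
        else:
            hi = mid
    return hi

for lam in (0.5, 1.0, 3.0, 10.0):
    print(lam, legendre(L, 1.0, lam), legendre(L0, 0.5, lam), legendre(L1, 20.0, lam))
```

The bound (86) is rather tight for small $x$: the two sides agree up to terms of order $x^{3/2}$.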

Another crucial ingredient in the concentration analysis of the sum of two 
random variables is a deep technical lemma of J. Bretagnolle and E. Rio [82]. In 
the further development of this chapter, we use this argument to obtain a large 
family of concentration inequalities that are asymptotically "almost sharp" in 
a wide variety of situations. 

Lemma 4.8 (J. Bretagnolle & E. Rio [82]) For any pair of random variables 
$A$ and $B$ with finite log-Laplace transform in a neighborhood of 0, we have 

$$\forall x\ge 0\qquad(L_{A+B}^\star)^{-1}(x)\le(L_A^\star)^{-1}(x)+(L_B^\star)^{-1}(x)\qquad(87)$$

We also quote the following reverse type formulae that allow one to turn most 
of the concentration inequalities developed in these notes into Bernstein style 
exponential inequalities. 
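Lemma 4.9 For any $(u,v)\in\mathbb R_+$, we have 

$$u\,(L_0^\star)^{-1}(x)+v\,(L_1^\star)^{-1}(x)\le\big(L^\star_{a(u,v),b(u,v)}\big)^{-1}(x)\qquad(88)$$

with the functions 

$$a(u,v):=\Big(2u+\frac v3\Big)\quad\text{and}\quad b(u,v):=\big(\sqrt2\,u+v\big)^2$$

and the Laplace function 

$$L_{a,b}(t)=\frac{b}{2a^2}\,L(at)\quad\text{with}\quad L_{a,b}^\star(\lambda)\ge\frac{\lambda^2}{2(b+\lambda a)}$$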

Proof: 

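Using the estimates (85) and (86) we prove that 

$$u\,(L_0^\star)^{-1}(x)+v\,(L_1^\star)^{-1}(x)\le 2u\big(x+\sqrt x\big)+v\Big(\frac x3+\sqrt{2x}\Big)=a(u,v)\,x+\sqrt{2x\,b(u,v)}$$

with 

$$a(u,v):=\Big(2u+\frac v3\Big)\quad\text{and}\quad b(u,v):=\big(\sqrt2\,u+v\big)^2$$

Now, using lemma 4.7 we observe that 

$$ax+\sqrt{2xb}=\big(L_{a,b}^\star\big)^{-1}(x)\quad\text{with}\quad L_{a,b}(t)=\frac{b}{2a^2}\,L(at)$$

Finally, we have 

$$L^\star(\lambda)=\big(\sqrt{\lambda+1}-1\big)^2\ge\frac{(\lambda/2)^2}{(1+\lambda/2)}$$

The r.h.s. inequality can be easily checked using the fact that 

$$\big(\sqrt{1+2\lambda}-1\big)^2=\Big(\frac{2\lambda}{1+\sqrt{1+2\lambda}}\Big)^2=\frac{2\lambda^2}{(1+\lambda)+\sqrt{1+2\lambda}}\ge\frac{\lambda^2}{(1+\lambda)}\qquad\Big(\Longleftarrow\ \sqrt{1+2\lambda}\le(1+\lambda)\Big)$$

This implies that 

$$L_{a,b}^\star(\lambda)=\frac{b}{2a^2}\,L^\star\Big(\frac{2a}{b}\,\lambda\Big)\ge\frac{\lambda^2}{2(b+\lambda a)}$$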
This ends the proof of the lemma. 
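
The algebra behind this lemma can be verified on concrete numbers. In the sketch below the values of $u$, $v$, $x$ and $\lambda$ are arbitrary illustration values, and the helper names are hypothetical; the code checks that $ax+\sqrt{2xb}$ inverts $L^\star_{a,b}$ and that the Bernstein type lower bound holds.

```python
import math

def L_star(lam: float) -> float:
    """L*(lam) = (sqrt(1 + lam) - 1)^2, the transform of L(t) = t^2/(1-t)."""
    return (math.sqrt(1.0 + lam) - 1.0) ** 2

def a_b(u: float, v: float):
    """Parameters a(u,v) = 2u + v/3 and b(u,v) = (sqrt(2) u + v)^2."""
    return 2.0 * u + v / 3.0, (math.sqrt(2.0) * u + v) ** 2

def L_ab_star(a: float, b: float, lam: float) -> float:
    """L*_{a,b}(lam) = b/(2 a^2) * L*(2 a lam / b)."""
    return b / (2.0 * a * a) * L_star(2.0 * a * lam / b)

def upper_quantile(a: float, b: float, x: float) -> float:
    """a x + sqrt(2 x b), claimed to be the inverse of L*_{a,b} at x."""
    return a * x + math.sqrt(2.0 * x * b)

u, v = 1.5, 0.7                      # arbitrary illustration values
a, b = a_b(u, v)
for x in (0.1, 1.0, 5.0, 20.0):
    print(x, upper_quantile(a, b, x), L_ab_star(a, b, upper_quantile(a, b, x)))
```

Reading the result the other way round, a deviation bound of the form $a\,x+\sqrt{2xb}$ with probability $1-e^{-x}$ yields the Bernstein style tail $\exp\{-\lambda^2/(2(b+\lambda a))\}$ at level $\lambda$.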
4.6.2 Concentration inequalities 

In this section, we investigate some elementary concentration inequalities for 
bounded and chi-square type random variables. We also apply these results to 
empirical processes associated with independent random variables. 

Proposition 4.1 Let $A$ be a centred random variable such that $A\le 1$. If we 
set $\sigma_A=\mathbb E(A^2)^{1/2}$, then for any $t\ge 0$, we have 

$$L_A(t)\le\sigma_A^2\,L_1(t)\qquad(89)$$

In addition, the probability of the following events 

$$A\le\sigma_A^2\,(L_1^\star)^{-1}\Big(\frac{x}{\sigma_A^2}\Big)\le\frac x3+\sigma_A\sqrt{2x}$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 

Proof: 

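To prove (89) we use the decomposition 

$$\mathbb E\big(e^{tA}-1-tA\big)=\mathbb E\big(L_1(tA)\,1_{A<0}\big)+\mathbb E\big(L_1(tA)\,1_{A\in[0,1]}\big)$$

Since we have 

$$\forall x\le 0\qquad L_1(tx)\le(tx)^2/2$$

and 

$$\forall x\in[0,1]\qquad L_1(tx)=x^2\sum_{n\ge 2}x^{n-2}\,t^n/n!\le x^2\,L_1(t)$$

we conclude that 

$$\mathbb E\big(e^{tA}\big)\le 1+\frac{t^2}{2}\,\mathbb E\big(A^2\,1_{A<0}\big)+L_1(t)\,\mathbb E\big(A^2\,1_{A\in[0,1]}\big)\le 1+L_1(t)\,\sigma_A^2\le e^{L_1(t)\,\sigma_A^2}$$

Using lemma 4.7 we readily prove that 

$$(L_A^\star)^{-1}(x)\le\sigma_A^2\,(L_1^\star)^{-1}\Big(\frac{x}{\sigma_A^2}\Big)\le\frac x3+\sigma_A\sqrt{2x}$$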

This ends the proof of the proposition. ■ 
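
For a concrete check of the estimate (89), one may take a centred Bernoulli variable $A=B-p$ with $B\sim\mathrm{Bernoulli}(p)$, so that $A\le 1$ and $\sigma_A^2=p(1-p)$. This example and the helper names below are assumptions made for illustration; the log-Laplace transform is available in closed form, so no simulation is needed.

```python
import math

def L1(t: float) -> float:
    """L_1(t) = e^t - 1 - t."""
    return math.exp(t) - 1.0 - t

def log_laplace_centred_bernoulli(p: float, t: float) -> float:
    """L_A(t) for A = B - p with B ~ Bernoulli(p); A is centred and A <= 1."""
    return math.log(p * math.exp(t * (1.0 - p)) + (1.0 - p) * math.exp(-t * p))

def gap(p: float, t: float) -> float:
    """sigma_A^2 L_1(t) - L_A(t), which (89) asserts to be non negative."""
    sigma2 = p * (1.0 - p)
    return sigma2 * L1(t) - log_laplace_centred_bernoulli(p, t)

smallest_gap = min(
    gap(p, 0.05 * k)
    for p in (0.05, 0.3, 0.5, 0.9)
    for k in range(1, 201)
)
print(smallest_gap)
```

The gap is smallest for small $p$ and small $t$, where the third cumulant $p(1-p)(1-2p)$ is close to the variance.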



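Proposition 4.2 For any measurable function $f$, with $0<\mathrm{osc}(f)\le a$, any 
$N\ge 1$, and any $t\ge 0$, we have 

$$L_{\sqrt N\,V(X)(f)}(t)\le N\,\sigma^2(f/a)\,L_1(at)\qquad(90)$$

In addition, the probability of the following events 

$$V(X)(f)\le a^{-1}\sigma^2(f)\,\sqrt N\ (L_1^\star)^{-1}\Big(\frac{x\,a^2}{N\sigma^2(f)}\Big)\le\frac{xa}{3\sqrt N}+\sqrt{2x\,\sigma(f)^2}\qquad(91)$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 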
Proof: 

Replacing $f$ by $f/a$, there is no loss of generality to assume that $a=1$. Using 
the same arguments as the ones we used in the proof of proposition 4.1 we find 
that 

$$\log\mathbb E\big(e^{t(f(X^i)-\mu^i(f))}\big)\le\mu^i\big(\big[f-\mu^i(f)\big]^2\big)\,L_1(t)$$

from which we conclude that 

$$L_N(t):=\log\mathbb E\big(e^{t\sqrt N\,V(X)(f)}\big)=\sum_{i=1}^N\log\mathbb E\big(e^{t(f(X^i)-\mu^i(f))}\big)\le\overline L_N(t):=N\,\sigma^2(f)\,L_1(t)$$

By lemma 4.7 we have 

$$(L_N^\star)^{-1}(x)\le(\overline L_N^\star)^{-1}(x)=N\,\sigma^2(f)\,(L_1^\star)^{-1}\Big(\frac{x}{N\sigma^2(f)}\Big)$$

This ends the proof of the proposition. ■ 
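
The deviation bound (91) can be compared with an exact computation in a simple model. The sketch assumes i.i.d. Bernoulli($p$) variables $X^i$ and $f(x)=x$ on $\{0,1\}$, so that $\mathrm{osc}(f)=1=a$, $\sigma^2(f)=p(1-p)$ and $\sqrt N\,V(X)(f)$ is a centred binomial variable; the model and the function name are illustration choices, not part of the notes.

```python
import math

def prob_event(N: int, p: float, x: float) -> float:
    """Exact P( V(X)(f) <= x/(3 sqrt(N)) + sqrt(2 x sigma^2(f)) ) in the Bernoulli model."""
    sigma = math.sqrt(p * (1.0 - p))
    threshold = x / (3.0 * math.sqrt(N)) + sigma * math.sqrt(2.0 * x)
    total = 0.0
    for k in range(N + 1):
        # V(X)(f) = (k - N p) / sqrt(N) when k of the N variables equal 1.
        if (k - N * p) / math.sqrt(N) <= threshold:
            total += math.comb(N, k) * p ** k * (1.0 - p) ** (N - k)
    return total

for N in (10, 50, 200):
    for x in (0.5, 1.0, 2.0, 4.0):
        print(N, x, prob_event(N, 0.3, x), 1.0 - math.exp(-x))
```

In this model the exact probabilities are noticeably larger than $1-e^{-x}$, which is expected from a bound that holds uniformly over all distributions with the same variance and oscillation.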

Proposition 4.3 For any random variable $B$ such that 

$$\mathbb E\big(|B|^m\big)^{1/m}\le b(2m)^2\,c\quad\text{with}\quad c<\infty$$

for any $m\ge 1$, with the finite constants $b(m)$ defined in (24), we have 

$$L_B(t)\le ct+L_0(ct)\qquad(92)$$

for any $0\le ct<1/2$. In addition, the probability of the following events 

$$B\le c\,\big[1+(L_0^\star)^{-1}(x)\big]\le c\,\big[1+2(x+\sqrt x)\big]$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 

Proof: 

Replacing $B$ by $B/c$, there is no loss of generality to assume that $c=1$. We 
recall that $b(2m)^{2m}=\mathbb E(U^{2m})$ for every centred Gaussian random variable with 
$\mathbb E(U^2)=1$ and 

$$\forall t\in[0,1/2)\qquad\sum_{m\ge 0}\frac{t^m}{m!}\,b(2m)^{2m}=\frac{1}{\sqrt{1-2t}}=\mathbb E\big(\exp\{tU^2\}\big)$$

This implies that 

$$\mathbb E\big(\exp\{tB\}\big)\le\sum_{m\ge 0}\frac{t^m}{m!}\,b(2m)^{2m}=\frac{1}{\sqrt{1-2t}}$$

for any $0\le t<1/2$. In other words, we have 

$$L_{B-1}(t):=\log\mathbb E\big(\exp\{t(B-1)\}\big)\le L_0(t)$$

and 

$$L_B(t)=t+L_{B-1}(t)\le t+L_0(t)$$

from which we conclude that 

$$L_B^\star(\lambda)=L_{B-1}^\star(\lambda-1)\ \Longrightarrow\ (L_B^\star)^{-1}(x)=1+(L_{B-1}^\star)^{-1}(x)\le 1+(L_0^\star)^{-1}(x)$$



This ends the proof of the proposition. ■ 
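
The extremal case of this proposition is $B=U^2$ with $U$ a standard Gaussian variable, for which the moment condition holds with equality and $c=1$. The sketch below (an illustration; the helper names are hypothetical) checks the moment identity $b(2m)^{2m}=(2m)!/(m!\,2^m)=(2m-1)!!$ and compares the exact chi-square tail with $e^{-x}$ at the level $1+2(x+\sqrt x)$.

```python
import math

def chi2_tail(y: float) -> float:
    """Exact P(U^2 > y) for a standard Gaussian U."""
    return math.erfc(math.sqrt(y / 2.0))

def deviation_level(x: float) -> float:
    """Level c [1 + 2 (x + sqrt(x))] of the proposition, with c = 1."""
    return 1.0 + 2.0 * (x + math.sqrt(x))

def double_factorial_odd(m: int) -> int:
    """(2m - 1)!! = E(U^{2m})."""
    out = 1
    for k in range(1, 2 * m, 2):
        out *= k
    return out

def L0(t: float) -> float:
    """L_0(t) = -t - log(1 - 2t)/2 on [0, 1/2[."""
    return -t - 0.5 * math.log(1.0 - 2.0 * t)

for x in (0.1, 0.5, 1.0, 2.0, 5.0, 10.0):
    print(x, chi2_tail(deviation_level(x)), math.exp(-x))
```

For this choice of $B$ the Laplace estimate (92) is an identity, since $\log\mathbb E(e^{tU^2})=-\frac12\log(1-2t)=t+L_0(t)$.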

Remark 4.6 We end this section with some comments on the estimate (70). 
Using the fact that $b(m)\le b(2m)$ (see for instance (25)) we readily deduce from 
this estimate that 

$$\mathbb E\big(|V(X)(f)|^m\big)^{1/m}\le 6\sqrt2\ b(2m)^2\,\sigma(f)$$

for any $m\ge 1$, and for any $N$ s.t. $2\sigma^2(f)N\ge 1$. Thus, if we set 

$$B=|V(X)(f)|\quad\text{and}\quad c=6\sqrt2\,\sigma(f)$$

in proposition 4.3 we prove that for any $N$ s.t. $2\sigma^2(f)N\ge 1$, and for any 
$0\le t<1/(12\sqrt2\,\sigma(f))$ 

$$L_{|V(X)(f)|}(t)\le 6\sqrt2\,\sigma(f)\,t+L_0\big(6\sqrt2\,\sigma(f)\,t\big)$$

In addition, the probability of the following events 

$$|V(X)(f)|\le 6\sqrt2\,\sigma(f)\,\big[1+(L_0^\star)^{-1}(x)\big]\le 6\sqrt2\,\sigma(f)\,\big[1+2(x+\sqrt x)\big]$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 

When $N$ is chosen so that $2\sigma^2(f)N\ge 1$, using (91) we improve the above 
inequality. Indeed, using this concentration inequality implies that for any 
$f\in\mathrm{Osc}(E)$, the probability of the following events 

$$V(X)(f)\le\sqrt2\,\sigma(f)\,\Big(\frac x3+\sqrt x\Big)\qquad(93)$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 

4.7 Perturbation analysis 

This section is mainly concerned with the proof of theorem 4.2 and theorem 4.4. 
We recall that for any second order smooth function $F$ on $\mathbb R^d$, for some $d\ge 1$, 
$F(m(X)(f))$ stands for the random functionals defined by 

$$f=(f_i)_{1\le i\le d}\in\mathrm{Osc}(E)^d\ \mapsto\ F(m(X)(f))=F\big(m(X)(f_1),\ldots,m(X)(f_d)\big)\in\mathbb R$$

Both results rely on the following second order decomposition of independent 
interest. 


Proposition 4.4 For any $N\ge 1$, we have the decomposition 

$$\sqrt N\,\big[F(m(X)(f))-F(\mu(f))\big]=V(X)\big[D_\mu(F)(f)\big]+\frac{1}{\sqrt N}\,R(X)(f)$$

with a first order functional $D_\mu(F)(f)$ defined in (72), and a second order term 
$R(X)(f)$ such that 

$$\mathbb E\big(|R(X)(f)|^m\big)^{1/m}\le\frac12\,b(2m)^2\,\|\nabla^2F_f\|_1$$

for any $m\ge 1$, with the parameter defined in (73). 

Proof: 

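Using a Taylor first order expansion, we have 

$$\sqrt N\,\big[F(m(X)(f))-F(\mu(f))\big]=\nabla F(\mu(f))\,V(X)(f)^T+\frac{1}{\sqrt N}\,R(X)(f)$$

with the second order remainder term 

$$R(X)(f):=\int_0^1(1-t)\ V(X)(f)\ \nabla^2F\big(t\,m(X)(f)+(1-t)\,\mu(f)\big)\ V(X)(f)^T\ dt$$

We notice that 

$$\nabla F(\mu(f))\,V(X)(f)^T=V(X)\big[\nabla F(\mu(f))\,f^T\big]$$

and 

$$\mathrm{osc}\big(\nabla F(\mu(f))\,f^T\big)\le\sum_{i=1}^d\Big|\frac{\partial F}{\partial u^i}(\mu(f))\Big|$$

It is also easily checked that 

$$\mathbb E\big(|R(X)(f)|^m\big)^{1/m}\le\frac12\sum_{i,j=1}^d\ \sup_{\nu\in\mathcal P(E)}\Big|\frac{\partial^2F}{\partial u^i\partial u^j}(\nu(f))\Big|\ \mathbb E\big(|V(X)(f_i)\,V(X)(f_j)|^m\big)^{1/m}$$

and for any $1\le i,j\le d$, we have 

$$\mathbb E\big(|V(X)(f_i)\,V(X)(f_j)|^m\big)^{1/m}\le\mathbb E\big(V(X)(f_i)^{2m}\big)^{1/(2m)}\ \mathbb E\big(V(X)(f_j)^{2m}\big)^{1/(2m)}\le b(2m)^2$$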

This ends the proof of the proposition. 



We are now in position to prove theorem 4.2. 
Proof of theorem 4.2: 

We set 

$$N\,\big[F(m(X)(f))-F(\mu(f))\big]=A+B$$

with 

$$A=\sqrt N\ V(X)\big[D_\mu(F)(f)\big]\quad\text{and}\quad B=R(X)(f)$$

Combining proposition 4.2 with proposition 4.3, if we set 

$$g=D_\mu(F)(f)\qquad a=\|\nabla F(\mu(f))\|_1\quad\text{and}\quad c=\frac12\,\|\nabla^2F_f\|_1$$

then we have 

$$L_A(t)\le N\,\sigma^2(g/a)\,L_1(at)$$

$$L_B(t)=ct+L_{B-c}(t)\quad\text{with}\quad L_{B-c}(t)\le L_0(ct)$$

On the other hand, we have 

$$(L_A^\star)^{-1}(x)\le N\,a\,\sigma^2(g/a)\ (L_1^\star)^{-1}\Big(\frac{x}{N\sigma^2(g/a)}\Big)$$

and using the fact that 

$$L_B(t)=ct+L_{B-c}(t)$$

we prove that 

$$L_B^\star(\lambda)=L_{B-c}^\star(\lambda-c)\ \Longrightarrow\ (L_B^\star)^{-1}(x)=c+(L_{B-c}^\star)^{-1}(x)\le c\,\big(1+(L_0^\star)^{-1}(x)\big)$$

Using Bretagnolle-Rio's lemma, we find that 

$$(L_{A+B}^\star)^{-1}(x)\le(L_A^\star)^{-1}(x)+(L_B^\star)^{-1}(x)\le N\,a^{-1}\sigma^2(g)\ (L_1^\star)^{-1}\Big(\frac{x\,a^2}{N\sigma^2(g)}\Big)+c\,\big(1+(L_0^\star)^{-1}(x)\big)$$

This ends the proof of the theorem. ■ 

Now, we come to the proof of theorem 4.4. 
Proof of theorem 4.4: 

We consider the empirical processes 

$$f\in\mathcal F_i\ \mapsto\ m(X)(f)\in\mathbb R$$

associated with d classes of functions J-i, 1 < i < d, defined in section ITT] We 
further assume that ||/i|| V osc{ fi) < 1, for any fi G Ti, and we set 

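$$\pi_\psi\big(\|V(X)\|_{\mathcal F}\big):=\sup_{1\le i\le d}\pi_\psi\big(\|V(X)\|_{\mathcal F_i}\big)$$

Using theorem 4.3 we have that 

$$\pi_\psi\big(\|V(X)\|_{\mathcal F}\big)\le 12^2\int_0^2\sqrt{\log\big(8+\mathcal N(\mathcal F,\epsilon)^2\big)}\ d\epsilon$$

with 

$$\mathcal N(\mathcal F,\epsilon):=\sup_{1\le i\le d}\mathcal N(\mathcal F_i,\epsilon)$$

Using proposition 4.4 for any collection of functions 

$$f=(f_i)_{1\le i\le d}\in\mathcal F:=\prod_{i=1}^d\mathcal F_i$$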
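we have 

$$\sqrt N\,\sup_{f\in\mathcal F}\big|F(m(X)(f))-F(\mu(f))\big|\le\|\nabla F_\mu\|_\infty\sum_{i=1}^d\|V(X)\|_{\mathcal F_i}+\frac{1}{2\sqrt N}\,\|\nabla^2F\|_\infty\Big(\sum_{i=1}^d\|V(X)\|_{\mathcal F_i}\Big)^2$$

If we set 

$$A:=\|\nabla F_\mu\|_\infty\sum_{i=1}^d\|V(X)\|_{\mathcal F_i}$$

then we find that 

$$\pi_\psi(A)\le\|\nabla F_\mu\|_\infty\sum_{i=1}^d\pi_\psi\big(\|V(X)\|_{\mathcal F_i}\big)$$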

By lemma 521 this imphes that 

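$$\mathbb E\big(e^{tA}\big)\le\big(1+t\,\pi_\psi(A)\big)\,e^{(t\pi_\psi(A))^2}\le e^{at+\frac12t^2b}$$

with $b=2a^2$ and $a=\pi_\psi(A)$. 

Notice that 

$$L_{A-a}(t)\le\overline L(t):=\frac12\,t^2\,b$$

Recalling that 

$$\overline L^\star(\lambda)=\frac{\lambda^2}{2b}\quad\text{and}\quad(\overline L^\star)^{-1}(x)=\sqrt{2bx}$$

we conclude that 

$$(L_A^\star)^{-1}(x)=a+(L_{A-a}^\star)^{-1}(x)\le a+\sqrt{2bx}=\pi_\psi(A)\,\big(1+2\sqrt x\big)$$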

Now, we come to the analysis of the second order term defined by 

$$B=\frac{1}{2\sqrt N}\,\|\nabla^2F\|_\infty\Big(\sum_{i=1}^d\|V(X)\|_{\mathcal F_i}\Big)^2$$

Using the inequality 

$$\Big(\sum_{i=1}^d a_i\Big)^m\le d^{\,m-1}\sum_{i=1}^d a_i^m$$

which is valid for any $d\ge 1$, any $m\ge 1$, and any sequence of real numbers 
$(a_i)_{1\le i\le d}\in\mathbb R_+^d$, we prove that 

$$\mathbb E(B^m)\le\beta^m\,d^{\,2m-1}\sum_{i=1}^d\mathbb E\big(\|V(X)\|_{\mathcal F_i}^{2m}\big)$$

with 

$$\beta:=\frac{1}{2\sqrt N}\,\|\nabla^2F\|_\infty$$



Combining lemma l¥^ with theorem 14. 3[ we conchide that 

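$$\mathbb E(B^m)\le m!\ \Big(\beta\,\big(d\,\pi_\psi(\|V(X)\|_{\mathcal F})\big)^2\Big)^m$$

If we set 

$$b:=\beta\,\big(d\,\pi_\psi(\|V(X)\|_{\mathcal F})\big)^2$$

then we have that 

$$\mathbb E\big(e^{tB}\big)\le\sum_{m\ge 0}(bt)^m=\frac{1}{1-bt}=e^{bt}\times e^{2L_0(bt/2)}$$

for any $0\le t<1/b$ with the convex increasing function $L_0$ introduced on 
page 70, so that 

$$2L_0(bt/2)=-bt-\log(1-bt)$$

Using lemma 4.7 we prove that 

$$L_{B-b}(t)\le 2L_0(bt/2)$$

and 

$$(L_B^\star)^{-1}(x)=b+(L_{B-b}^\star)^{-1}(x)\le b\,\Big(1+(L_0^\star)^{-1}\Big(\frac x2\Big)\Big)=\frac{1}{2\sqrt N}\,\|\nabla^2F\|_\infty\,\big(d\,\pi_\psi(\|V(X)\|_{\mathcal F})\big)^2\,\Big(1+(L_0^\star)^{-1}\Big(\frac x2\Big)\Big)$$

Finally, using the Bretagnolle-Rio's lemma, we prove that 

$$(L_{A+B}^\star)^{-1}(x)\le d\,\pi_\psi(\|V(X)\|_{\mathcal F})\,\Big[\|\nabla F_\mu\|_\infty\,\big(1+2\sqrt x\big)+\frac{1}{2\sqrt N}\,\|\nabla^2F\|_\infty\,\big(d\,\pi_\psi(\|V(X)\|_{\mathcal F})\big)\,\Big(1+(L_0^\star)^{-1}\Big(\frac x2\Big)\Big)\Big]$$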

This ends the proof of theorem 4.4. ■ 

5 Interacting empirical processes 

5.1 Introduction 

This short chapter is concerned with the concentration analysis of sequences of 
empirical processes associated with conditionally independent random variables. 
In preparation for the work in chapter 6 on the collection of Feynman-Kac 
particle models introduced in section 1.4, we consider a general class of interacting 
particle processes with not necessarily mean field type dependency. 
Firstly, we analyze the concentration properties of integrals of local sampling 
error sequences, with general random but predictable test functions. These 
results will be used to analyze the concentration properties of the first order 
fluctuation terms of the particle models. 

We also present a stochastic perturbation technique to analyze the second 
order type decompositions. We consider finite marginal models and empirical 
processes. We close the chapter with the analysis of the covering numbers and 
the entropy parameters of linear transformation of classes of functions. 

We end this introductory section, with the precise description of the main 
mathematical objects we shall analyze in the further development of the chapter. 

We let $X_n^{(N)}=(X_n^{(N,i)})_{1\le i\le N}$ be a Markov chain on some product state 
spaces $E_n^N$, for some $N\ge 1$. We also let $\mathcal G_n^{(N)}$ be the increasing $\sigma$-field generated 
by the random sequence $(X_p^{(N)})_{0\le p\le n}$. We further assume that $(X_n^{(N,i)})_{1\le i\le N}$ 
are conditionally independent, given $\mathcal G_{n-1}^{(N)}$. 

As usual, when there is no possible confusion, we simplify notation 
and suppress the index $(\cdot)^{(N)}$ so that we write $(X_n,X_n^i,\mathcal G_n)$ instead of 
$(X_n^{(N)},X_n^{(N,i)},\mathcal G_n^{(N)})$. 

In this simplified notation, we also denote by $\mu_n^i$ the conditional distribution 
of the random state $X_n^i$ given $\mathcal G_{n-1}$; that is, we have that 

$$\mu_n^i=\mathrm{Law}\big(X_n^i\ |\ \mathcal G_{n-1}\big)$$

Notice that the conditional distributions 

$$\mu_n:=\frac1N\sum_{i=1}^N\mu_n^i$$

represent the local conditional mean of the occupation measures 

$$m(X_n):=\frac1N\sum_{i=1}^N\delta_{X_n^i}$$

At this level of generality, we cannot obtain any kind of concentration properties 
for the deviations of the occupation measures $m(X_n)$ around some deterministic 
limiting value. 

In chapter 6, dedicated to particle approximations of Feynman-Kac measures 
$\eta_n$, we shall deal with mean field type random measures $\mu_n^i$, in the sense that 
the randomness only depends on the location of the random state $X_{n-1}^i$ and on 
the current occupation measure $m(X_{n-1})$. In this situation, the fluctuation of 
$m(X_n)$ around the limiting deterministic measures $\eta_n$ will be expressed in terms 
of second order Taylor's type expansions w.r.t. the local sampling errors 

$$V(X_p)=\sqrt N\,\big(m(X_p)-\mu_p\big)$$

from the origin $p=0$, up to the current time $p=n$. 

The first order terms will be expressed in terms of integral formulae of pre- 
dictable functions fp w.r.t. the local sampling error measures V{Xp). These 
stochastic first order expansions are defined below. 
Definition 5.1 For any sequence of $\mathcal G_{n-1}$-measurable random functions $f_n\in\mathrm{Osc}(E_n)$, 
and any numbers $a_n\in\mathbb R_+$, we set 

$$V_n(X)(f)=\sum_{p=0}^n a_p\,V(X_p)(f_p)\qquad(94)$$

For any $\mathcal G_{n-1}$-measurable random function $f_n\in\mathrm{Osc}(E_n)$, we have 

$$\mathbb E\big(V(X_n)(f_n)\ |\ \mathcal G_{n-1}\big)=0$$

$$\mathbb E\big(V(X_n)(f_n)^2\ |\ \mathcal G_{n-1}\big)=\sigma_n^N(f_n)^2:=\frac1N\sum_{i=1}^N\mu_n^i\big(\big[f_n-\mu_n^i(f_n)\big]^2\big)$$

We also assume that we have an almost sure estimate 

$$\sup_{N\ge 1}\sigma_n^N(f_n)^2\le\sigma_n^2\quad\text{for some positive constant }\sigma_n^2\le 1.\qquad(95)$$

5.2 Finite marginal models 

We will now derive a quantitative concentration inequality for the general random 
field models of the following form 

$$W_n(X)(f)=V_n(X)(f)+\frac{1}{\sqrt N}\,R_n(X)(f)\qquad(96)$$

with $V_n(X)(f)$ defined in (94), and a second order term such that 

$$\mathbb E\big(|R_n(X)(f)|^m\big)^{1/m}\le b(2m)^2\,c_n$$

for any $m\ge 1$, for some finite constant $c_n<\infty$ whose values only depend on 
the parameter $n$. 

For a null remainder term $R_n(X)(f)=0$, these concentration properties are 
easily derived using proposition 4.2. 

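Proposition 5.1 We let $V_n(X)(f)$ be the random field sequence defined in (94). 
For any $t\ge 0$, we have that 

$$L_{\sqrt N\,V_n(X)(f)}(t)\le N\,\overline\sigma_n^2\ L_1(t\,a_n^\star)$$

with the parameters 

$$\overline\sigma_n^2:=\sum_{0\le p\le n}\sigma_p^2\quad\text{and}\quad a_n^\star:=\max_{0\le p\le n}a_p$$

In addition, the probability of the following events 

$$V_n(X)(f)\le\sqrt N\ a_n^\star\,\overline\sigma_n^2\ (L_1^\star)^{-1}\Big(\frac{x}{N\overline\sigma_n^2}\Big)\le a_n^\star\,\Big(\frac{x}{3\sqrt N}+\sqrt{2x\,\overline\sigma_n^2}\Big)$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 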
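Proof: 

By proposition 4.2 we have 

$$\log\mathbb E\big(e^{t\,a_n\sqrt N\,V(X_n)(f_n)}\ \big|\ \mathcal G_{n-1}\big)\le N\,\sigma_n^2\ L_1(t\,a_n^\star)$$

This clearly implies that 

$$L_{\sqrt N\,V_n(X)(f)}(t)\le\overline L_1(t):=N\,\overline\sigma_n^2\ L_1(t\,a_n^\star)$$

Using lemma 4.7 we conclude that 

$$\big(L^\star_{\sqrt N\,V_n(X)(f)}\big)^{-1}(x)\le\big(\overline L_1^\star\big)^{-1}(x)=N\,a_n^\star\,\overline\sigma_n^2\ (L_1^\star)^{-1}\Big(\frac{x}{N\overline\sigma_n^2}\Big)$$

The last assertion is a direct consequence of (86). This ends the proof of the 
proposition. ■ 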



Theorem 5.1 We let $W_n(X)(f)$ be the random field sequence defined in (96). 
In this situation, the probability of the events 

$$\sqrt N\ W_n(X)(f)\le c_n\,\big(1+(L_0^\star)^{-1}(x)\big)+N\,a_n^\star\,\overline\sigma_n^2\ (L_1^\star)^{-1}\Big(\frac{x}{N\overline\sigma_n^2}\Big)$$

is greater than $1-e^{-x}$, for any $x\ge 0$. In the above display, $\overline\sigma_n$ stands for the 
variance parameter defined in proposition 5.1. 

Proof: 

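We set $\sqrt N\ W_n(X)(f)=A_n+B_n$, with 

$$A_n=\sqrt N\ V_n(X)(f)\quad\text{and}\quad B_n=R_n(X)(f)$$

By proposition 4.3 and proposition 5.1 we have 

$$L_{A_n}(t)\le\overline L_{A_n}(t):=N\,\overline\sigma_n^2\ L_1(t\,a_n^\star)$$

and 

$$L_{B_n-c_n}(t)\le\overline L_{B_n-c_n}(t):=L_0(c_n t)$$

We recall that 

$$L_{B_n}(t)=c_n t+L_{B_n-c_n}(t)\ \Longrightarrow\ L_{B_n}^\star(\lambda)=L_{B_n-c_n}^\star(\lambda-c_n)$$

Using lemma 4.7 we also have that 

$$\big(L_{B_n}^\star\big)^{-1}(x)=c_n+\big(L_{B_n-c_n}^\star\big)^{-1}(x)\le c_n+\big(\overline L_{B_n-c_n}^\star\big)^{-1}(x)=c_n\,\big(1+(L_0^\star)^{-1}(x)\big)$$

In the same vein, arguing as in the end of the proof of proposition 5.1, we have 

$$\big(L^\star_{\sqrt N\,V_n(X)(f)}\big)^{-1}(x)\le N\,a_n^\star\,\overline\sigma_n^2\ (L_1^\star)^{-1}\Big(\frac{x}{N\overline\sigma_n^2}\Big)$$

The end of the proof is now a direct consequence of the Bretagnolle-Rio's lemma. 
This ends the proof of the theorem. ■ 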

5.3 Empirical processes 

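We let $V_n(X)$ be the random field sequence defined in (94), and we consider a 
sequence of classes of $\mathcal G_{n-1}$-measurable random functions $\mathcal F_n$, such that $\|f_n\|\vee\mathrm{osc}(f_n)\le 1$, 
for any $f_n\in\mathcal F_n$. 

Definition 5.2 For any $f=(f_n)_{n\ge 0}\in\mathcal F:=(\mathcal F_n)_{n\ge 0}$, and any sequence of 
numbers $a=(a_n)_{n\ge 0}\in\mathbb R_+^{\mathbb N}$, we set 

$$V_n(X)(f)=\sum_{p=0}^n a_p\,V(X_p)(f_p)\quad\text{and}\quad\|V_n(X)\|_{\mathcal F}=\sup_{f\in\mathcal F}|V_n(X)(f)|$$

We further assume that for any $n\ge 0$, and any $\epsilon>0$, we have an almost sure 
estimate 

$$\mathcal N(\mathcal F_n,\epsilon)\le\mathcal N_n(\epsilon)\qquad(97)$$

for some non increasing function $\mathcal N_n(\epsilon)$ such that 

$$b_n:=12^2\int_0^2\sqrt{\log\big(8+\mathcal N_n(\epsilon)^2\big)}\ d\epsilon<\infty$$

In this situation, we have 

$$\pi_\psi\big(\|V_n(X)\|_{\mathcal F}\big)\le\sum_{p=0}^n a_p\ \pi_\psi\big(\|V(X_p)\|_{\mathcal F_p}\big)$$

Using theorem 4.3, given $\mathcal G_{n-1}$ we have the almost sure upper bound 

$$\pi_\psi\big(\|V(X_p)\|_{\mathcal F_p}\big)\le 12^2\int_0^2\sqrt{\log\big(8+\mathcal N(\mathcal F_p,\epsilon)^2\big)}\ d\epsilon\le b_p$$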

Combining lemma WTH and lemma W^ we readily prove the following theorem. 

Theorem 5.2 For any classes of $\mathcal G_{n-1}$-measurable random functions $\mathcal F_n$ satisfying 
the entropy condition (97), we have 

$$\pi_\psi\big(\|V_n(X)\|_{\mathcal F}\big)\le c_n:=\sum_{p=0}^n a_p\,b_p$$

In particular, the probability of the events 

$$\|V_n(X)\|_{\mathcal F}\le c_n\,\sqrt{x+\log 2}$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 
Next, we consider classes of non random functions $\mathcal F=(\mathcal F_n)_{n\ge 0}$. We further 
assume that $\|f_n\|\vee\mathrm{osc}(f_n)\le 1$, for any $f_n\in\mathcal F_n$, and 

$$I_1(\mathcal F):=12^2\int_0^2\sqrt{\log\big(8+\mathcal N(\mathcal F,\epsilon)^2\big)}\ d\epsilon<\infty$$

with 

$$\mathcal N(\mathcal F,\epsilon)=\sup_{n\ge 0}\mathcal N(\mathcal F_n,\epsilon)<\infty$$

Theorem 5.3 We let $W_n(X)(f)$, $f\in\mathcal F$, be the random field sequence defined 
by 

$$W_n(X)(f)=V_n(X)(f)+\frac{1}{\sqrt N}\,R_n(X)(f)$$

with a second order term such that 

$$\mathbb E\Big(\sup_{f\in\mathcal F}|R_n(X)(f)|^m\Big)\le m!\ c_n^m$$

for any $m\ge 1$, for some finite constant $c_n<\infty$ whose values only depend on 
the parameter $n$. In this situation, the probability of the events 

$$\|W_n(X)\|_{\mathcal F}\le\Big(\sum_{p=0}^n a_p\Big)\ I_1(\mathcal F)\,\big(1+2\sqrt x\big)+\frac{c_n}{\sqrt N}\,\Big(1+(L_0^\star)^{-1}\Big(\frac x2\Big)\Big)$$

is greater than $1-e^{-x}$, for any $x\ge 0$. 

Proof: 

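We notice that $\sqrt N\ \|W_n(X)\|_{\mathcal F}\le A_n+B_n$, with 

$$A_n=\sqrt N\ \|V_n(X)\|_{\mathcal F}\quad\text{and}\quad B_n=\sup_{f\in\mathcal F}|R_n(X)(f)|$$

Using the fact that 

$$\sup_{f\in\mathcal F}|V_n(X)(f)|\le\sum_{p=0}^n a_p\ \|V(X_p)\|_{\mathcal F_p}$$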
by lemma WaX we have 



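$$\pi_\psi\big(\|V_n(X)\|_{\mathcal F}\big)\le\sum_{p=0}^n a_p\ \pi_\psi\big(\|V(X_p)\|_{\mathcal F_p}\big)$$

Using theorem 4.3 we also have that 

$$\pi_\psi\big(\|V(X_p)\|_{\mathcal F_p}\big)\le I_1(\mathcal F):=12^2\int_0^2\sqrt{\log\big(8+\mathcal N(\mathcal F,\epsilon)^2\big)}\ d\epsilon$$

with 

$$\mathcal N(\mathcal F,\epsilon)=\sup_{n\ge 0}\mathcal N(\mathcal F_n,\epsilon)$$

This implies that 

$$\pi_\psi(A_n)\le\overline a_n\,\sqrt N\ I_1(\mathcal F)\quad\text{with}\quad\overline a_n:=\sum_{p=0}^n a_p$$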

By lemma H?^ we have 

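$$\mathbb E\big(e^{tA_n}\big)\le\big(1+t\,\pi_\psi(A_n)\big)\,e^{(t\pi_\psi(A_n))^2}\le e^{\alpha_n t+\frac12t^2\beta_n}$$

with 

$$\beta_n=2\alpha_n^2\quad\text{and}\quad\alpha_n=\pi_\psi(A_n)$$

Notice that 

$$L_{A_n-\alpha_n}(t)\le\overline L_n(t):=\frac12\,t^2\,\beta_n$$

Recalling that 

$$\overline L_n^\star(\lambda)=\frac{\lambda^2}{2\beta_n}\quad\text{and}\quad\big(\overline L_n^\star\big)^{-1}(x)=\sqrt{2\beta_n x}$$

we conclude that 

$$\big(L_{A_n}^\star\big)^{-1}(x)=\alpha_n+\big(L_{A_n-\alpha_n}^\star\big)^{-1}(x)\le\alpha_n+\sqrt{2\beta_n x}=\pi_\psi(A_n)\,\big(1+2\sqrt x\big)$$

On the other hand, under our assumption, we also have that 

$$\mathbb E\big(e^{tB_n}\big)\le\sum_{m\ge 0}(c_n t)^m=\frac{1}{1-c_n t}=e^{c_n t}\times e^{2L_0(c_n t/2)}$$

for any $0\le t<1/c_n$ with the convex increasing function $L_0$ introduced on 
page 70, so that 

$$2L_0(c_n t/2)=-c_n t-\log(1-c_n t)$$

Using lemma 4.7 we conclude that 

$$L_{B_n-c_n}(t)\le 2L_0(c_n t/2)$$

and 

$$\big(L_{B_n}^\star\big)^{-1}(x)=c_n+\big(L_{B_n-c_n}^\star\big)^{-1}(x)\le c_n\,\Big(1+(L_0^\star)^{-1}\Big(\frac x2\Big)\Big)$$

The end of the proof is now a direct consequence of the Bretagnolle-Rio's lemma. 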
5.4 Covering numbers and entropy methods 

In this final section, we derive some properties of covering numbers for some 
classes of functions. These two results are the key to derive uniform concentra- 
tion inequalities w.r.t. the time parameter for Feynman-Kac particle models. 
This subject is investigated in chapter [S] 

We let $(E_n,\mathcal E_n)_{n=0,1}$ be a pair of measurable state spaces, and $\mathcal F$ be a separable 
collection of measurable functions $f:E_1\to\mathbb R$ such that $\|f\|\le 1$ and 
$\mathrm{osc}(f)\le 1$. 

We consider a Markov transition $M(x_0,dx_1)$ from $E_0$ into $E_1$, a probability 
measure $\mu$ on $E_0$, and a function $G$ from $E_0$ into $[0,1]$. We associate with these 
objects the class of functions 

$$G\cdot M(\mathcal F)=\{G\,M(f)\ :\ f\in\mathcal F\}$$

and 

$$G\cdot(M-\mu M)(\mathcal F)=\{G\,[M(f)-\mu M(f)]\ :\ f\in\mathcal F\}$$



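Lemma 5.1 For any $\epsilon>0$, we have 

$$\mathcal N\big[G\cdot M(\mathcal F),\epsilon\big]\le\mathcal N(\mathcal F,\epsilon)$$

Proof: 

For any probability measure $\eta$ on $E_0$, we let $\{f_1,\ldots,f_{n_\epsilon}\}$ be the centers of 
$n_\epsilon=\mathcal N(\mathcal F,L_2(\eta),\epsilon)$ 

$L_2(\eta)$-balls of radius at most $\epsilon$ covering $\mathcal F$. For any $f\in\mathcal F$, there exists some 
$1\le i\le n_\epsilon$ such that 

$$\eta\big([G(f-f_i)]^2\big)^{1/2}\le\eta\big([(f-f_i)]^2\big)^{1/2}\le\epsilon$$

This implies that 

$$\mathcal N\big(G\cdot\mathcal F,L_2(\eta),\epsilon\big)\le\mathcal N\big(\mathcal F,L_2(\eta),\epsilon\big)$$

In much the same way, we let $\{f_1,\ldots,f_{n_\epsilon}\}$ be the $n_\epsilon=\mathcal N(\mathcal F,L_2(\eta M),\epsilon)$ centers 
of $L_2(\eta M)$-balls of radius at most $\epsilon$ covering $\mathcal F$. In this situation, for any $f\in\mathcal F$, 
there exists some $1\le i\le n_\epsilon$ such that 

$$\eta\big([M(f)-M(f_i)]^2\big)^{1/2}\le\eta M\big([(f-f_i)]^2\big)^{1/2}\le\epsilon$$

This implies that 

$$\mathcal N\big(M(\mathcal F),L_2(\eta),\epsilon\big)\le\mathcal N\big(\mathcal F,L_2(\eta M),\epsilon\big)$$

This ends the proof of the lemma. ■ 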

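One of the simplest ways to control the covering numbers of the second class 
of functions is to assume that $M$ satisfies the following condition $M(x,dy)\ge\delta\,\nu(dy)$, 
for any $x\in E_0$, and for some measure $\nu$, and some $\delta\in\,]0,1[$. Indeed, in 
this situation we observe that 

$$M_\delta(x,dy)=\frac{M(x,dy)-\delta\,\nu(dy)}{1-\delta}$$

is a Markov transition and 

$$(1-\delta)\,\big[M_\delta(f)(x)-M_\delta(f)(y)\big]=\big[M(f)(x)-M(f)(y)\big]$$

This implies that 

$$(1-\delta)\,\big[M_\delta(f)(x)-\mu M_\delta(f)\big]=\big[M(f)(x)-\mu M(f)\big]$$

and 

$$\eta\big(\big(M(f)-\mu M(f)\big)^2\big)^{1/2}\le 2(1-\delta)\ \big[\eta M_{\delta,\mu}(|f|^2)\big]^{1/2}$$

with the Markov transition 

$$M_{\delta,\mu}(x,dy)=\frac12\,\big[M_\delta(x,dy)+\mu M_\delta(dy)\big]$$

We let $\{f_1,\ldots,f_{n_\epsilon}\}$ be the $n_\epsilon=\mathcal N(\mathcal F,L_2(\eta M_{\delta,\mu}),\epsilon/2)$ centers of $L_2(\eta M_{\delta,\mu})$-balls 
of radius at most $\epsilon$ covering $\mathcal F$. If we set 

$$\overline f=M(f)-\mu M(f)\quad\text{and}\quad\overline f_i=M(f_i)-\mu M(f_i)$$

then we find that 

$$\overline f-\overline f_i=M(f-f_i)-\mu M(f-f_i)$$

from which we prove that 

$$\eta\big(\big(\overline f-\overline f_i\big)^2\big)^{1/2}\le 2(1-\delta)\ \big[\eta M_{\delta,\mu}(|f-f_i|^2)\big]^{1/2}$$

We conclude that 

$$\mathcal N\big((M-\mu M)(\mathcal F),L_2(\eta),2\epsilon(1-\delta)\big)\le\mathcal N\big(\mathcal F,L_2(\eta M_{\delta,\mu}),\epsilon\big)$$

and therefore 

$$\mathcal N\big((M-\mu M)(\mathcal F),2\epsilon(1-\delta)\big)\le\mathcal N(\mathcal F,\epsilon)$$

or equivalently 

$$\mathcal N\Big(\frac{1}{1-\delta}\,(M-\mu M)(\mathcal F),\epsilon\Big)\le\mathcal N(\mathcal F,\epsilon/2)$$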
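
On a finite state space the minorization argument is easy to visualise. The sketch below is a toy illustration with a hypothetical $3\times 3$ transition matrix: it takes $\delta\,\nu(j)=\min_x M(x,j)$ (one possible choice of minorizing measure), builds $M_\delta$, and checks the identity $(1-\delta)\,[M_\delta(f)(x)-M_\delta(f)(y)]=M(f)(x)-M(f)(y)$.

```python
def minorize(M):
    """Return (delta, nu, M_delta) with delta*nu(j) = min_x M(x, j)."""
    n_rows, n_cols = len(M), len(M[0])
    col_min = [min(M[x][j] for x in range(n_rows)) for j in range(n_cols)]
    delta = sum(col_min)
    nu = [c / delta for c in col_min]
    # M_delta = (M - delta * nu) / (1 - delta); delta * nu(j) equals col_min[j].
    M_delta = [[(M[x][j] - col_min[j]) / (1.0 - delta) for j in range(n_cols)]
               for x in range(n_rows)]
    return delta, nu, M_delta

def apply(K, f):
    """(K f)(x) = sum_j K(x, j) f(j)."""
    return [sum(k * v for k, v in zip(row, f)) for row in K]

# Hypothetical transition matrix and test function, chosen only for illustration.
M = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
f = [0.0, 0.4, 1.0]

delta, nu, M_delta = minorize(M)
Mf, Mdf = apply(M, f), apply(M_delta, f)
print(delta, nu, M_delta)
```

The factor $(1-\delta)$ gained in the identity is what produces the contraction in the covering number estimate above.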

In more general situations, we quote the following result. 
Lemma 5.2 For any $\epsilon>0$, we have 

$$\mathcal N\big[G\cdot(M-\mu M)(\mathcal F),\,2\epsilon\,\beta(M)\big]\le\mathcal N(\mathcal F,\epsilon)$$

Proof: 

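We consider a Hahn-Jordan orthogonal decomposition 

$$\big(M(x,dy)-\mu M(dy)\big)=M_\mu^+(x,dy)-M_\mu^-(x,dy)$$

$$M_\mu^+(x,dy)=\big(M(x,\cdot)-\mu M\big)^+\quad\text{and}\quad M_\mu^-(x,dy)=\big(M(x,\cdot)-\mu M\big)^-$$

with 

$$\|M(x,\cdot)-\mu M\|_{tv}=M_\mu^+(x,E_1)=M_\mu^-(x,E_1)\le\beta(M)$$

By construction, we have 

$$M(f)(x)-\mu M(f)=M_\mu^+(x,E_1)\ \Big(\overline M_\mu^+(f)(x)-\overline M_\mu^-(f)(x)\Big)$$

with 

$$\overline M_\mu^+(x,dy):=\frac{M_\mu^+(x,dy)}{M_\mu^+(x,E_1)}\quad\text{and}\quad\overline M_\mu^-(x,dy):=\frac{M_\mu^-(x,dy)}{M_\mu^-(x,E_1)}$$

This implies that 

$$|M(f)(x)-\mu M(f)|\le 2\beta(M)\ \overline M_\mu(|f|)(x)$$

with 

$$\overline M_\mu(x,dy)=\frac12\,\Big(\overline M_\mu^+(x,dy)+\overline M_\mu^-(x,dy)\Big)$$

One concludes that 

$$\eta\big(\big(M(f)-\mu M(f)\big)^2\big)^{1/2}\le 2\beta(M)\ \big[\eta\overline M_\mu(|f|^2)\big]^{1/2}$$

We let $\{f_1,\ldots,f_{n_\epsilon}\}$ be the $n_\epsilon=\mathcal N(\mathcal F,L_2(\eta\overline M_\mu),\epsilon/2)$ centers of $L_2(\eta\overline M_\mu)$-balls 
of radius at most $\epsilon$ covering $\mathcal F$. 
If we set 

$$\overline f=M(f)-\mu M(f)\quad\text{and}\quad\overline f_i=M(f_i)-\mu M(f_i)$$

then we find that 

$$\overline f-\overline f_i=M(f-f_i)-\mu M(f-f_i)$$

from which we prove that 

$$\eta\big(\big(\overline f-\overline f_i\big)^2\big)^{1/2}\le 2\beta(M)\ \big[\eta\overline M_\mu(|f-f_i|^2)\big]^{1/2}$$

In this situation, for any $\overline f\in(M-\mu M)(\mathcal F)$, there exists some $1\le i\le n_\epsilon$ such 
that 

$$\eta\big(\big(\overline f-\overline f_i\big)^2\big)^{1/2}\le\epsilon\,\beta(M)$$

We conclude that 

$$\mathcal N\big((M-\mu M)(\mathcal F),L_2(\eta),\epsilon\beta(M)\big)\le\mathcal N\big(\mathcal F,L_2(\eta\overline M_\mu),\epsilon/2\big)$$

and therefore 

$$\mathcal N\big(G\cdot(M-\mu M)(\mathcal F),\epsilon\beta(M)\big)\le\mathcal N\big((M-\mu M)(\mathcal F),\epsilon\beta(M)\big)\le\mathcal N(\mathcal F,\epsilon/2)$$

This ends the proof of the lemma. 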
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6 Feynman-Kac particle processes 

6.1 Introduction 

In this chapter, we investigate the concentration properties of the collection
of Feynman-Kac particle measures introduced in section 1.4, in terms of the
contraction parameters $\tau_{k,l}(n)$ and $\overline{\tau}_{k,l}(m)$ introduced in definition 3.3 and in
corollary 3.1.

In the first section, section 6.2, we present some basic first order decompositions
of the Boltzmann-Gibbs transformation associated with some regular
potential function.

In section 6.3, we combine the semigroup techniques developed in chapter 3
with a stochastic perturbation analysis to derive first order integral expansions
in terms of local random fields and Feynman-Kac transport operators.

In section 6.4, we combine these key formulae with the concentration analysis
of interacting empirical processes developed in chapter 5. We derive quantitative
concentration estimates for finite marginal models, as well as for empirical
processes w.r.t. some classes of functions. The final two sections, section 6.5
and section 6.6, are devoted respectively to particle free energy models, and
backward particle Markov models.

6.2 First order expansions 

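For any positive potential function $G$, any measures $\mu$ and $\nu$, and any function
$f$ on $E$, we have

$$[\Psi_G(\mu) - \Psi_G(\nu)](f) = \frac{1}{\mu(G_\nu)}\,(\mu-\nu)\left(d_\nu\Psi_G(f)\right) \quad (98)$$

with the functions

$$d_\nu\Psi_G(f) := G_\nu\,(f - \Psi_G(\nu)(f)) \quad\text{and}\quad G_\nu := G/\nu(G) \quad (99)$$

Notice that

$$|[\Psi_G(\mu) - \Psi_G(\nu)](f)| \le g\,|(\mu-\nu)(d_\nu\Psi_G(f))|$$

and

$$\|d_\nu\Psi_G(f)\| \le g\,\mathrm{osc}(f) \quad\text{with}\quad g := \sup_{x,y}\,(G(x)/G(y))$$

It is also important to observe that

$$|[\Psi_G(\mu) - \Psi_G(\nu)](f)| \le \frac{1}{\mu(G')}\,|(\mu-\nu)(d'_\nu\Psi_G(f))| \le g\,|(\mu-\nu)(d'_\nu\Psi_G(f))|$$

with the integral operator $d'_\nu\Psi_G$ from $\mathrm{Osc}(E)$ into itself defined by

$$d'_\nu\Psi_G(f) := G'\,(f - \Psi_G(\nu)(f)) \quad\text{and}\quad G' := G/\|G\|$$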
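The first order decomposition of the Boltzmann-Gibbs transformation is an exact algebraic identity, so it can be verified on a finite state space. The sketch below assumes a hypothetical 4-point space with illustrative values for $G$, $\mu$, $\nu$ and $f$ (none of them taken from the text), and compares both sides of the identity $[\Psi_G(\mu) - \Psi_G(\nu)](f) = \mu(G_\nu)^{-1}(\mu-\nu)(d_\nu\Psi_G(f))$.

```python
# Sketch only: G, mu, nu and f are illustrative choices.

def integrate(measure, f):
    """Return measure(f) = sum_x measure[x] * f[x]."""
    return sum(m * v for m, v in zip(measure, f))

def boltzmann_gibbs(G, measure):
    """Return Psi_G(measure)(x) = G(x) measure(x) / measure(G)."""
    z = integrate(measure, G)
    return [g * m / z for g, m in zip(G, measure)]

G = [1.0, 2.0, 0.5, 1.5]
mu = [0.1, 0.2, 0.3, 0.4]
nu = [0.25, 0.25, 0.25, 0.25]
f = [0.3, -1.0, 2.0, 0.7]

# Left-hand side: difference of the two Boltzmann-Gibbs measures applied to f.
lhs = integrate(boltzmann_gibbs(G, mu), f) - integrate(boltzmann_gibbs(G, nu), f)

# Right-hand side: first order function d_nu Psi_G(f) = G_nu (f - Psi_G(nu)(f)).
G_nu = [g / integrate(nu, G) for g in G]
centering = integrate(boltzmann_gibbs(G, nu), f)
d_f = [gn * (v - centering) for gn, v in zip(G_nu, f)]
rhs = (integrate(mu, d_f) - integrate(nu, d_f)) / integrate(mu, G_nu)

# Uniform bound ||d_nu Psi_G(f)|| <= g osc(f) with g = sup G(x)/G(y).
g_ratio = max(G) / min(G)
osc_f = max(f) - min(f)
sup_norm_d_f = max(abs(v) for v in d_f)
```

Since $\nu(d_\nu\Psi_G(f)) = 0$, the term $\nu(d_\nu\Psi_G(f))$ subtracted in `rhs` is zero up to rounding; it is kept only to mirror the displayed formula.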
Using lemma 5.2, we readily prove the following lemma.

Lemma 6.1 We let $\mathcal{F}$ be a separable collection of measurable functions $f : E' \to \mathbb{R}$
on some possibly different state space $E'$, and such that $\|f\| \le 1$ and $\mathrm{osc}(f) \le 1$.
For any Markov transition $M$ from $E$ into $E'$, we set

$$d'_\nu\Psi_G M(\mathcal{F}) := \{d'_\nu\Psi_G(M(f)) \,:\, f \in \mathcal{F}\}$$

In this situation, we have the uniform estimate

$$\sup_{\nu}\,\mathcal{N}\left[d'_\nu\Psi_G M(\mathcal{F}), 2\epsilon\beta(M)\right] \le \mathcal{N}(\mathcal{F}, \epsilon) \quad (100)$$

6.3 A stochastic perturbation analysis 

Mean field particle models can be thought of as a stochastic perturbation technique
for solving nonlinear measure valued equations of the form

$$\eta_n = \Phi_n(\eta_{n-1})$$

The random perturbation term is encapsulated into the sequence of local random
sampling errors $(V^N_n)_{n \ge 0}$ given by the local perturbation equations

$$\eta^N_n = \Phi_n(\eta^N_{n-1}) + \frac{1}{\sqrt{N}}\,V^N_n$$

One natural way to control the fluctuations and the concentration properties of
the particle measures $(\eta^N_n, \gamma^N_n)$ around their limiting values $(\eta_n, \gamma_n)$ is to express
the random fields $(W^{\gamma,N}_n, W^{\eta,N}_n)$ defined by

$$\gamma^N_n = \gamma_n + \frac{1}{\sqrt{N}}\,W^{\gamma,N}_n \qquad \eta^N_n = \eta_n + \frac{1}{\sqrt{N}}\,W^{\eta,N}_n$$

in terms of the empirical random fields $(V^N_n)_{n \ge 0}$.

As shown in (67), it is important to recall that the local sampling random
field models $V^N_n$ belong to the class of empirical processes we analyzed in
section 4.1. The stochastic analysis developed in chapter 4 applies directly to
these models. For instance, using theorem 4.1, we have the quantitative almost
sure estimate of the amplitude of the stochastic perturbations

$$E\left(|V^N_n(f)|^m \,\big|\, \mathcal{G}^N_{n-1}\right)^{1/m} \le b(m) \quad (101)$$

for any $m \ge 1$ and any test function $f \in \mathrm{Osc}(E_n)$.
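To illustrate the size of the local sampling errors, the sketch below simulates a single sampling step on a finite state space. It assumes that, given the past, the particles are $N$ independent draws from a fixed distribution `probs` (a stand-in for $\Phi_n(\eta^N_{n-1})$), and it estimates the second moment of $V^N(f) = \sqrt{N}\,(\eta^N(f) - \Phi(f))$ by Monte Carlo. For $m = 2$ this moment is the variance of $f$ under the sampling distribution, which never exceeds $\mathrm{osc}(f)^2/4$. The distribution, the test function and the sample sizes are hypothetical.

```python
import random

def sample_fluctuation(probs, f, n_particles, rng):
    """Draw N i.i.d. states from probs and return sqrt(N) * (empirical mean - exact mean) of f."""
    states = list(range(len(probs)))
    draws = rng.choices(states, weights=probs, k=n_particles)
    empirical = sum(f[i] for i in draws) / n_particles
    exact = sum(p * v for p, v in zip(probs, f))
    return (n_particles ** 0.5) * (empirical - exact)

rng = random.Random(0)            # fixed seed so the sketch is reproducible
probs = [0.2, 0.5, 0.3]           # hypothetical sampling distribution
f = [0.0, 1.0, 0.4]               # test function with osc(f) = 1

n_particles, n_runs = 200, 2000
values = [sample_fluctuation(probs, f, n_particles, rng) for _ in range(n_runs)]
second_moment = sum(v * v for v in values) / n_runs

mean_f = sum(p * v for p, v in zip(probs, f))
variance_f = sum(p * v * v for p, v in zip(probs, f)) - mean_f ** 2
```

The estimate does not depend on $N$ beyond Monte Carlo noise, which is the point of the $\sqrt{N}$ normalization.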

The first order expansions presented in the further development of this section
will be expressed in terms of the random functions $d^N_{p,n}(f)$ and $G^N_{p,n}$, and
the first order functions $d_{p,n}(f)$ defined below.

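Definition 6.1 For any $0 \le p \le n$, and any function $f$ on $E_n$, we denote by
$d_{p,n}(f)$ the function on $E_p$ defined by

$$d_{p,n}(f) = d_{\eta_p}\Psi_{G_{p,n}}(P_{p,n}(f))$$

For any $N \ge 1$, and any $0 \le p \le n$, we also denote by $G^N_{p,n}$, $d^N_{p,n}(f)$, and
$d'^N_{p,n}(f)$ the $\mathcal{G}^N_{p-1}$-measurable random functions on $E_p$ given by

$$G^N_{p,n} := \frac{G_{p,n}}{\Phi_p(\eta^N_{p-1})(G_{p,n})} \qquad d^N_{p,n}(f) := d_{\Phi_p(\eta^N_{p-1})}\Psi_{G_{p,n}}(P_{p,n}(f))$$

and

$$d'^N_{p,n}(f) := d'_{\Phi_p(\eta^N_{p-1})}\Psi_{G_{p,n}}(P_{p,n}(f))$$

Notice that

$$\|G^N_{p,n}\| \le g_{p,n} \quad\text{and}\quad \|d_{p,n}(f)\| \vee \|d^N_{p,n}(f)\| \le g_{p,n}\,\beta(P_{p,n})$$

as well as

$$\|d'^N_{p,n}(f)\| \le \beta(P_{p,n}) \quad\text{and}\quad \mathrm{osc}(d'^N_{p,n}(f)) \le 2\beta(P_{p,n})$$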

As promised, the next theorem presents some key first order decompositions 
which are the progenitors for our other results. Further details on these ex- 
pansions and their use in the bias and the fluctuation analysis of Feynman-Kac 
particle models can be found [191 [531 HSl HH] • 

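Theorem 6.1 For any $0 \le p \le n$, and any function $f$ on $E_n$, we have the
decomposition

$$W^{\eta,N}_n(f) = \sum_{p=0}^n \frac{1}{\eta^N_p(G^N_{p,n})}\,V^N_p\left(d^N_{p,n}(f)\right) \quad (102)$$

and the $L_m$-mean error estimates

$$E\left(|W^{\eta,N}_n(f)|^m\right)^{1/m} \le 2\,b(m)\,\tau_{1,1}(n) \quad (103)$$

with the parameter $\tau_{1,1}(n)$ defined in (50).
In addition, we have

$$W^{\eta,N}_n(f) = \sum_{p=0}^n V^N_p\left[d_{p,n}(f)\right] + \frac{1}{\sqrt{N}}\,R^N_n(f) \quad (104)$$

with a second order remainder term

$$R^N_n(f) := -\sum_{p=0}^{n-1} \frac{1}{\eta^N_p(\overline{G}_p)}\,W^{\eta,N}_p(\overline{G}_p)\,W^{\eta,N}_p\left[d_{p,n}(f)\right] \quad\text{where}\quad \overline{G}_p := G_p/\eta_p(G_p)$$

such that

$$\sup_{f \in \mathrm{Osc}(E_n)} E\left(|R^N_n(f)|^m\right)^{1/m} \le 4\,b(2m)^2\,\tau_{2,1}(n) \quad (105)$$

with the parameter $\tau_{2,1}(n)$ defined in (50).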
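Proof:

The proof of (102) is based on the telescoping sum decomposition

$$\eta^N_n - \eta_n = \sum_{p=0}^n \left[\Phi_{p,n}(\eta^N_p) - \Phi_{p,n}\left(\Phi_p(\eta^N_{p-1})\right)\right]$$

with the convention $\Phi_0(\eta^N_{-1}) = \eta_0$, for $p = 0$. Recalling that

$$\Phi_{p,n}(\mu) = \Psi_{G_{p,n}}(\mu)\,P_{p,n}$$

we prove that

$$\Phi_{p,n}(\eta^N_p) - \Phi_{p,n}\left(\Phi_p(\eta^N_{p-1})\right) = \left[\Psi_{G_{p,n}}(\eta^N_p) - \Psi_{G_{p,n}}\left(\Phi_p(\eta^N_{p-1})\right)\right] P_{p,n}$$

Using (98), we have

$$\sqrt{N}\,\left[\Phi_{p,n}(\eta^N_p) - \Phi_{p,n}\left(\Phi_p(\eta^N_{p-1})\right)\right](f) = \frac{1}{\eta^N_p(G^N_{p,n})}\,V^N_p\left(d_{\Phi_p(\eta^N_{p-1})}\Psi_{G_{p,n}}(P_{p,n}(f))\right)$$

This ends the proof of (102).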
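The proof of (104) is based on the telescoping sum decomposition

$$\eta^N_n - \eta_n = \sum_{p=0}^n \left[\eta^N_p\,\overline{Q}_{p,n} - \eta^N_{p-1}\,\overline{Q}_{p-1,n}\right]$$

with the convention $\eta^N_{-1}\overline{Q}_{-1,n} = \eta_0\overline{Q}_{0,n} = \eta_n$, for $p = 0$. Using the fact that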
we prove that 

K - Vn] (/) = E [^?p^ - % «_l)] Q,,„(/) + C(/) 
p=0 

with the second order remainder term 

n 

p=i 

Replacing / by the centred function (/ — ??«(/)), and using the fact that 

1 = 7?p_i(Gp_i) and r]p [dp,„(/ - ?7n(/))] = 
we conclude that 

n 
p=0 

with the second order remainder term 

n 

Rnif) := J2[Vp-^-Vp-i](Gp-i) 

X [*G,_, «-l) - *G,_, (?/p-l)] (A/p (dp,„(/))) 



p=l 



^p-il'-^p-iy 
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Finally, we observe that 

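$$d_{\eta_{p-1}}\Psi_{G_{p-1}}\left(M_p(d_{p,n}(f))\right) = \frac{G_{p-1}}{\eta_{p-1}(G_{p-1})}\,M_p\left(d_{p,n}(f)\right) = \overline{Q}_{p-1,p}\left(d_{p,n}(f)\right) = \overline{Q}_{p-1,n}\left(f - \eta_n(f)\right) = d_{p-1,n}(f)$$

This ends the proof of (104).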

Combining (102) with the almost sure estimates

$$E\left(|V^N_p(d^N_{p,n}(f))|^m \,\big|\, \mathcal{G}^N_{p-1}\right)^{1/m} \le 2\,b(m)\,\|G^N_{p,n}\|\,\beta(P_{p,n})$$

for any $m \ge 1$, we easily prove (103). Using the fact that



1 



inf.G^Jx; 



E 



->N 



^p(G;.J| \Qp^i <2bim)gp,„ 



l/n 



and 



E 



!/■ 



C [<«(/)] I \Sp-i < 2 6(m) gp,„ /3(P,,„) 



we prove (105). This ends the proof of the theorem. ■

6.4 Concentration inequalities 
6.4.1 Finite marginal models 



In section loTm dedicated to the variance analysis of the local sampling models, we 
have seen that the empirical random fields V^ satisfy the regularity conditions of 
the general interacting empirical process models V{Xn) presented in section [??T] 
To be more precise, we have the formulae 

n n 

p=0 p=0 

with the functions

$$\delta_{p,n}(f) = d_{p,n}(f)/a_p \in \mathrm{Osc}(E_p) \cap \mathcal{B}_1(E_p)$$

for any finite constants

$$a_p \ge 2\,\sup_{0 \le p \le n}\left(g_{p,n}\,\beta(P_{p,n})\right)$$

Therefore, if we fix the final time horizon n, with some slightly abusive 
notation we have the formulae 

n n 

Y.^p''[dpAf)]^Y. ^pV{Xp)[fp] 

p=0 p=0 
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with 

^p = ^p = i^p) l<^<N ^'P = ^^P a«=< and fp = Sp^nif)/a*„ 

— — 0<q<n 

We also notice that 



< 



4 



(^ ^; imp."(/)ii 



and therefore 



E (n-"^ [SpA.f)f \Qp^l) <7^^l 9lnP{Pp,n? 



with the uniform local variance parameters $\sigma_p$ defined in (64).

This shows that the regularity condition stated in (95) is met by replacing the
parameters $\sigma_p$ in the variance formula (95) by the constants $2\sigma_p\,g_{p,n}\,\beta(P_{p,n})/a^\star_n$,
with the uniform local variance parameters $\sigma_p$ defined in (64).

Using theorem 5.1, we easily prove the following exponential concentration
property.

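Theorem 6.2 ([44]) For any $n \ge 0$, any $f \in \mathrm{Osc}(E_n)$, and any $N \ge 1$, the
probability of the event

$$[\eta^N_n - \eta_n](f) \le \frac{4\,\tau_{2,1}(n)}{N}\,\left(1 + (L^\star_0)^{-1}(x)\right) + 2\,b_n\,\overline{\sigma}^2_n\,(L^\star_1)^{-1}\left(\frac{x}{N\,\overline{\sigma}^2_n}\right)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, with

$$\overline{\sigma}^2_n := \frac{1}{b^2_n}\,\sum_{0 \le p \le n} g^2_{p,n}\,\beta(P_{p,n})^2\,\sigma^2_p$$

for any choice of $b_n \ge \kappa(n)$. In the above display, $\tau_{2,1}(n)$ and $\kappa(n)$ stand for
the parameters defined in (50), and $\sigma_n$ is the uniform local variance parameter
defined in (64).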

We illustrate the impact of this theorem with two applications. The first 
one is concerned with regular and stable Feynman-Kac models satisfying the 
regularity conditions presented in section [2131 The second one is concerned with 
the concentration properties of the genealogical tree based models developed in 
section \\M 

In the first situation, combining corollary 3.1 with the estimates (85) and
(86), we prove the following uniform concentration inequalities w.r.t. the time
horizon.

Corollary 6.1 We assume that one of the regularity conditions $H_m(G,M)$
stated in section 3.4.1 is met for some $m \ge 0$, and we set

$$p_m(x) := 4\,\overline{\tau}_{2,1}(m)\,\left(1 + 2(x + \sqrt{x})\right) + \frac{2}{3}\,\overline{\kappa}(m)\,x$$

and

$$q_m(x) := \sqrt{8\,\overline{\sigma}^2\,\overline{\tau}_{2,2}(m)\,x} \quad\text{with}\quad \overline{\sigma}^2 = \sup_{n \ge 0}\sigma^2_n$$

In the above displayed formulae, $\overline{\tau}_{2,2}(m)$ and $\overline{\kappa}(m)$ stand for the parameters
defined in corollary 3.1, and $\sigma_n$ is the uniform local variance parameter defined
in (64).

In this situation, for any $n \ge 0$, any $f \in \mathrm{Osc}(E_n)$, any $N \ge 1$, and for any
$x \ge 0$, the probability of the event

$$[\eta^N_n - \eta_n](f) \le \frac{1}{N}\,p_m(x) + \frac{1}{\sqrt{N}}\,q_m(x)$$

is greater than $1 - e^{-x}$.

In the same vein, concentration inequalities for genealogical tree models can be
derived easily using the estimates (55).

Corollary 6.2 We let rj^ be the occupation measure of the genealogical tree 
model presented in 16'^)). We also set a^ = sup„>Q(T,^j, the supremum of is the 
uniform local variance parameters af^ defined in W^, and 

Pn,,n{x) = 4(x™.9'")' (1 + 2{x + ^/^)) + - f^ X 

6 [n + 1) 

and 

(lm{x) = (Xm5™) VSo^ 

In this situation, for any n > 0, any £„ G Osc(En), and any N > 1, the 
probability of the event 



N ^i/nx^'^ + l , ^ , ln + 1 



[Vn - Qn] (fn) < — ^ Pn,mi^) + V " ^ 9™(^) 

is greater than 1 — e^^, for any x > 0. 

6.4.2 Empirical processes 

The main aim of this section is to derive concentration inequalities for parti- 
cle empirical processes. Several consequences of this general theorem are also 
discussed, including uniform estimates w.r.t. the time parameter, and concen- 
tration properties of genealogical particle processes. 

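Theorem 6.3 We let $\mathcal{F}_n$ be a separable collection of measurable functions $f_n$
on $E_n$, such that $\|f_n\| \le 1$, $\mathrm{osc}(f_n) \le 1$, with finite entropy $I(\mathcal{F}_n) < \infty$. We have

$$\pi_\psi\left(\|W^{\eta,N}_n\|_{\mathcal{F}_n}\right) \le c_{\mathcal{F}_n}\,\tau_{1,1}(n)$$

with the parameter $\tau_{1,1}(n)$ defined in (50) and

$$c_{\mathcal{F}_n} \le 24^2 \int_0^1 \sqrt{\log\left(8 + \mathcal{N}(\mathcal{F}_n,\epsilon)^2\right)}\;d\epsilon \quad (106)$$

In particular, for any $n \ge 0$, and any $N \ge 1$, the probability of the following
event

$$\sup_{f \in \mathcal{F}_n} |\eta^N_n(f) - \eta_n(f)| \le \frac{c_{\mathcal{F}_n}}{\sqrt{N}}\,\tau_{1,1}(n)\,\sqrt{x + \log 2}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.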
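The factor $\sqrt{x + \log 2}$ in the last display has a short explanation. The following is a sketch under an assumption that is not restated in this section: the constant multiplying $\sqrt{x+\log 2}$ bounds an Orlicz norm $\pi_\psi(X) := \inf\{a > 0 : E(\psi(X/a)) \le 1\}$ of the nonnegative random variable $X$ of interest, with $\psi(u) = e^{u^2} - 1$.

```latex
% Sketch, assuming \psi(u) = e^{u^2}-1, X \ge 0 and 0 < \pi_\psi(X) < \infty.
\begin{align*}
\mathbb{P}\left(X \ge \pi_\psi(X)\,\sqrt{y}\right)
  &= \mathbb{P}\left(e^{(X/\pi_\psi(X))^2} \ge e^{y}\right)
   \le e^{-y}\;\mathbb{E}\left(e^{(X/\pi_\psi(X))^2}\right)
   \le 2\,e^{-y} \\
\text{and taking } y = x + \log 2: \qquad
\mathbb{P}\left(X \ge \pi_\psi(X)\,\sqrt{x + \log 2}\right) &\le e^{-x}
\end{align*}
```

The first inequality is Markov's inequality and the second uses $E(\psi(X/\pi_\psi(X))) \le 1$; the additive $\log 2$ absorbs the factor 2.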
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Proof: 

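Using (102), for any function $f_n \in \mathrm{Osc}(E_n)$ we have the estimate

$$|W^{\eta,N}_n(f_n)| \le 2 \sum_{p=0}^n g_{p,n}\,\beta(P_{p,n})\,\left|V^N_p\left(\delta^N_{p,n}(f_n)\right)\right|$$

with the $\mathcal{G}^N_{p-1}$-measurable random functions $\delta^N_{p,n}(f_n)$ on $E_p$ defined by

$$\delta^N_{p,n}(f_n) = \frac{1}{2\beta(P_{p,n})}\,d'^N_{p,n}(f_n)$$

By construction, we have

$$\|\delta^N_{p,n}(f_n)\| \le 1/2 \quad\text{and}\quad \delta^N_{p,n}(f_n) \in \mathrm{Osc}(E_p)$$

Using the uniform estimate (100), if we set

$$\mathcal{F}^N_{p,n} := \delta^N_{p,n}(\mathcal{F}_n) = \left\{\delta^N_{p,n}(f) \,:\, f \in \mathcal{F}_n\right\}$$

then we also prove the almost sure upper bound

$$\sup_{N \ge 1}\,\mathcal{N}\left[\mathcal{F}^N_{p,n}, \epsilon\right] \le \mathcal{N}(\mathcal{F}_n, \epsilon/2)$$

The end of the proof is now a direct consequence of theorem 5.2. This ends the
proof of the theorem. ■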



Corollary 6.3 We consider time homogeneous Feynman-Kac models on some
common measurable state space $E_n = E$. We also let $\mathcal{F}$ be a separable collection
of measurable functions $f$ on $E$, such that $\|f\| \le 1$, $\mathrm{osc}(f) \le 1$, with finite
entropy $I(\mathcal{F}) < \infty$.

We also assume that one of the regularity conditions $H_m(G,M)$ stated in
section 3.4.1 is met for some $m \ge 0$. In this situation, for any $n \ge 0$, and any
$N \ge 1$, the probability of the following event

$$\sup_{f \in \mathcal{F}} |\eta^N_n(f) - \eta_n(f)| \le \frac{c_{\mathcal{F}}}{\sqrt{N}}\,\overline{\tau}_{1,1}(m)\,\sqrt{x + \log 2}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.

In the same vein, using the estimates (58), we easily prove the following
corollary.

Corollary 6.4 We also assume that one of the regularity conditions $H_m(G,M)$
stated in section 3.4.1 is met for some $m \ge 0$.

We let $\mathcal{F}_n$ be a separable collection of measurable functions $\mathbf{f}_n$ on the path
space $\mathbf{E}_n$, such that $\|\mathbf{f}_n\| \le 1$, $\mathrm{osc}(\mathbf{f}_n) \le 1$, with finite entropy $I(\mathcal{F}_n) < \infty$.

We also let rj^ be the occupation measure of the genealogical tree model pre- 
sented in J6'ii|) . In this situation, for any n > 0, and any N > 1, the probability 
of the following event 



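$$\sup_{\mathbf{f}_n \in \mathcal{F}_n} |\eta^N_n(\mathbf{f}_n) - \mathbb{Q}_n(\mathbf{f}_n)| \le \frac{c_{\mathcal{F}_n}}{\sqrt{N}}\,(n+1)\,\chi_m\,g^m\,\sqrt{x + \log 2}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, with the constant $c_{\mathcal{F}_n}$ defined in (106).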



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The following corollaries are a direct consequence of ([75)) . 

Corollary 6.5 We assume that the conditions stated in corollary 6.3 are satisfied.
When $\mathcal{F}$ stands for the indicator functions (74) of cells in $E = \mathbb{R}^d$, for
some $d \ge 1$, the probability of the following event

$$\sup_{f \in \mathcal{F}} |\eta^N_n(f) - \eta_n(f)| \le c\;\overline{\tau}_{1,1}(m)\,\sqrt{\frac{d}{N}\,(x+1)}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, for some universal constant $c < \infty$ that
doesn't depend on the dimension.

Corollary 6.6 We assume that the conditions stated in corollary 6.4 are satisfied.
When $\mathcal{F}_n$ stands for product functions of indicators of cells (74) in the path
space $\mathbf{E}_n = (\mathbb{R}^{d_0} \times \ldots \times \mathbb{R}^{d_n})$, for some $d_p \ge 1$, $p \ge 0$, the probability of the
following event

$$\sup_{\mathbf{f}_n \in \mathcal{F}_n} |\eta^N_n(\mathbf{f}_n) - \mathbb{Q}_n(\mathbf{f}_n)| \le c\,(n+1)\,\chi_m\,g^m\,\sqrt{\frac{\sum_{0 \le p \le n} d_p}{N}\,(x+1)}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, for some universal constant $c < \infty$ that
doesn't depend on the dimension.

6.5 Particle free energy models 

6.5.1 introduction 

The main aim of this section is to analyze the concentration properties of the
particle free energy models introduced in section 1.4.2. More formally, the
unnormalized particle random field models discussed in this section are defined
below.

Definition 6.2 We denote by $\overline{\gamma}^N_n$ the normalized models defined by the following
formulae

$$\overline{\gamma}^N_n(f) = \gamma^N_n(f)/\gamma_n(1) = \eta^N_n(f) \prod_{0 \le p < n} \eta^N_p(\overline{G}_p)$$

with the normalized potential functions

$$\overline{G}_n := G_n/\eta_n(G_n)$$

We also let $W^{\gamma,N}_n$ and $W^{\overline{\gamma},N}_n$ be the random field particle models defined by

$$W^{\gamma,N}_n = \sqrt{N}\,[\gamma^N_n - \gamma_n] \quad\text{and}\quad W^{\overline{\gamma},N}_n(f) := W^{\gamma,N}_n(f)/\gamma_n(1)$$

These unnormalized particle models $\gamma^N_n$ have a particularly simple form.
They are defined in terms of products of empirical mean values $\eta^N_p(G_p)$ of the
potential functions $G_p$ w.r.t. the flow of normalized particle measures $\eta^N_p$ after
the $p$-th mutation stages, with $p < n$.

Thus, the concentration properties of $\gamma^N_n$ should be related in some way to
those of the interacting processes $\eta^N_n$ developed in section 6.4.
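The product structure of the particle free energies is easy to simulate. The sketch below assumes a time homogeneous model on a hypothetical 3-point state space, with a potential `G` and a transition matrix `M` chosen for illustration, and a simple multinomial selection followed by a mutation (one of several possible particle schemes, not necessarily the one analyzed here). It compares the particle estimate $\prod_{0 \le p < n}\eta^N_p(G)$ with the exact value $\prod_{0 \le p < n}\eta_p(G)$.

```python
import random

def exact_free_energy(eta0, G, M, n_steps):
    """Return prod_{p<n} eta_p(G), with eta_{p+1} = Psi_G(eta_p) M."""
    eta, z = list(eta0), 1.0
    size = len(G)
    for _ in range(n_steps):
        norm = sum(e * g for e, g in zip(eta, G))
        z *= norm
        updated = [e * g / norm for e, g in zip(eta, G)]
        eta = [sum(updated[x] * M[x][y] for x in range(size)) for y in range(size)]
    return z

def particle_free_energy(eta0, G, M, n_steps, n_particles, rng):
    """Return prod_{p<n} eta^N_p(G) for a selection/mutation particle scheme."""
    states = list(range(len(G)))
    particles = rng.choices(states, weights=eta0, k=n_particles)
    z = 1.0
    for _ in range(n_steps):
        weights = [G[x] for x in particles]
        z *= sum(weights) / n_particles                               # eta^N_p(G)
        parents = rng.choices(particles, weights=weights, k=n_particles)  # selection
        particles = [rng.choices(states, weights=M[x])[0] for x in parents]  # mutation
    return z

rng = random.Random(1)            # fixed seed for reproducibility
eta0 = [0.5, 0.3, 0.2]
G = [1.0, 0.6, 0.8]               # hypothetical positive potential
M = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]]

exact = exact_free_energy(eta0, G, M, n_steps=5)
estimate = particle_free_energy(eta0, G, M, n_steps=5, n_particles=2000, rng=rng)
ratio = estimate / exact          # particle free energy ratio, close to 1 for large N
```

The quantity `ratio` plays the role of $\overline{\gamma}^N_n(1) = \gamma^N_n(1)/\gamma_n(1)$, whose logarithm is the object of the concentration inequalities of this section.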




To begin with, we mention that 



0<p<n ^ ^^ 



For more general functions we also observe that for any function / on En, s.t. 
rjnif) = 1, we have the decompositions 






We readily deduce the following second order decompositions of the fluctuation
errors

$$W^{\overline{\gamma},N}_n(f) = \left[W^{\overline{\gamma},N}_n(1) + W^{\eta,N}_n(f)\right] + \frac{1}{\sqrt{N}}\,\left(W^{\overline{\gamma},N}_n(1)\;W^{\eta,N}_n(f)\right)$$

This decomposition allows us to reduce the concentration properties of $W^{\overline{\gamma},N}_n(f)$
to those of $W^{\eta,N}_n(f)$ and $W^{\overline{\gamma},N}_n(1)$.

In the first part of this section, we provide some key decompositions of
$W^{\gamma,N}_n$ in terms of the local sampling errors $V^N_n$, as well as a pivotal exponential
formula connecting the fluctuations of the particle free energies in terms of the
fluctuations of the potential empirical mean values.

In the second part of the section, we derive first order expansions, and
logarithmic concentration inequalities for particle free energy ratios
$\overline{\gamma}^N_n(1) = \gamma^N_n(1)/\gamma_n(1)$.

6.5.2 Some key decomposition formulae 

This section is mainly concerned with the proof of the following decomposition 
theorem. 

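Theorem 6.4 For any $0 \le p \le n$, and any function $f$ on $E_n$, we have the
decompositions

$$W^{\gamma,N}_n(f) = \sum_{p=0}^n \gamma^N_p(1)\,V^N_p\left(Q_{p,n}(f)\right) \quad (107)$$

$$W^{\overline{\gamma},N}_n(f) = \sum_{p=0}^n \overline{\gamma}^N_p(1)\,V^N_p\left(\overline{Q}_{p,n}(f)\right) \quad (108)$$

with the normalized Feynman-Kac semigroup

$$\overline{Q}_{p,n}(f) = Q_{p,n}(f)/\eta_p Q_{p,n}(1)$$

In addition, we have the exponential formulae

$$W^{\overline{\gamma},N}_n(1) = \sqrt{N}\,\left(\exp\left\{\frac{1}{\sqrt{N}} \sum_{0 \le p < n} \int_0^1 \frac{W^{\eta,N}_p(\overline{G}_p)}{1 + \frac{t}{\sqrt{N}}\,W^{\eta,N}_p(\overline{G}_p)}\;dt\right\} - 1\right) \quad (109)$$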
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Proof: 

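We use the telescoping sum decomposition

$$\gamma^N_n - \gamma_n = \sum_{p=0}^n \left(\gamma^N_p Q_{p,n} - \gamma^N_{p-1} Q_{p-1,n}\right)$$

with the conventions $Q_{n,n} = Id$, for $p = n$; and $\gamma^N_{-1}Q_{-1,n} = \gamma_0 Q_{0,n}$, for $p = 0$.
Using the fact that

$$\gamma^N_p(1) = \gamma^N_{p-1}(G_{p-1}) \quad\text{and}\quad \gamma^N_{p-1}Q_{p-1,n}(f) = \gamma^N_{p-1}\left(G_{p-1}\,M_p(Q_{p,n}(f))\right)$$

we prove that

$$\gamma^N_{p-1}Q_{p-1,n} = \gamma^N_p(1)\;\Phi_p(\eta^N_{p-1})\,Q_{p,n}$$

The end of the proof of the first decomposition is now easily completed. We
prove (108) using the following formulae

$$\overline{Q}_{p,n}(f)(x) = \frac{\gamma_p(1)}{\gamma_n(1)}\,Q_{p,n}(f)(x)$$

The proof of (109) is based on the fact that

$$\log y - \log x = \int_0^1 \frac{(y-x)}{x + t(y-x)}\;dt$$

for any positive numbers $x, y$. Indeed, we have the formula

$$\log\left(\gamma^N_n(1)/\gamma_n(1)\right) = \log\left(1 + \frac{1}{\sqrt{N}}\,\frac{W^{\gamma,N}_n(1)}{\gamma_n(1)}\right) = \sum_{0 \le p < n} \left(\log\eta^N_p(G_p) - \log\eta_p(G_p)\right)$$

This ends the proof of the theorem. ■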



6.5.3 Concentration inequalities 

Combining the exponential formulae (109) with the expansions (108), we derive
first order decompositions for the random sequence $\sqrt{N}\,\log\overline{\gamma}^N_n(1)$. These expansions
will be expressed in terms of the random predictable functions defined
below.

Definition 6.3 We let $h^N_{q,n}$ be the random $\mathcal{G}^N_{q-1}$-measurable functions given by

q<-p<n 

with the functions $d^N_{q,p}(\overline{G}_p)$ given in definition 6.1.
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Lemma 6.2 For any n > and any N > I, we have 



iogC(i)= Y. y<^K^) + ^Rn 



N logT^^l) = }^ K," iK.n) + ^^ K (110) 

0<g<n 

with a second order remainder term $R^N_n$ such that

$$E\left(|R^N_n|^m\right)^{1/m} \le b(2m)^2\,r(n)$$

for any $m \ge 1$, with some constant

$$r(n) \le 8 \sum_{0 \le p < n} g_p\,\left(2\,g_p\,\tau_{1,1}(p)^2 + \tau_{3,1}(p)\right)$$

Proof: 

Using the exponential formulae (jl09p . we have 

ViV log 7,^^(1) = VNlogl^l + ^Wl'^'il)^ 

Ni 



This implies that 






- dt 

) 



Vn log 7^(1) = Y. w;^'''iGp) + -j= Ry 

with the (negative) second order remainder term 

dt 



0<-D<n "'O 1 + TTv '^P y^P 



On the other hand, using (|108p we have 



E %""(Gp) = E ^" Kn) + -^ Ri 



jN,2 
0<p<n 0<g<n 

with the second order remainder order term 



0<<j<p<n '9 '-'-''J.P'' 

This gives the decomposition (jllOp . with the second remainder order term 






Using the fact that 



1 + -^ W;'^{G,) = t T^^iGp) + il-t)>t g- 
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for any t g]0, 1], with g^ := iidx Gp{x), we find that 



0<p<n ^P 



Using (|103p . we prove that 

l/r 



E 



(|ii^Y) < 4&(2rf 5: ^osc(G,)%Mbf 



0<p<n ^P 



from which we conclude that 

0<p<n 

In much the same way, we have 



E{\R^'^\'Y''<m2r)r E 9l-^Apf 



E{\R^'^\Y' 



l/Sr /, ,,,„,_ ,-,,2r\l/2r- 



and using (jlOip . we prove that 

This ends the proof of the lemma. ■ 

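We are now in position to state and to prove the following concentration
theorem.

Theorem 6.5 For any $N \ge 1$, $\epsilon \in \{+1,-1\}$, $n \ge 0$, and for any

$$\varsigma^\star_n \ge \sup_{0 \le q \le n} \varsigma_{q,n} \quad\text{with}\quad \varsigma_{q,n} := \frac{4}{n} \sum_{q \le p < n} g_{q,p}\,g_p\,\beta(P_{q,p})$$

the probability of the following events

$$\frac{\epsilon}{n}\,\log\overline{\gamma}^N_n(1) \le \frac{1}{N}\,\overline{r}(n)\,\left(1 + (L^\star_0)^{-1}(x)\right) + \varsigma^\star_n\,\overline{\sigma}^2_n\,(L^\star_1)^{-1}\left(\frac{x}{N\,\overline{\sigma}^2_n}\right)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, with the parameters

$$\overline{\sigma}^2_n := \sum_{0 \le q < n} \sigma^2_q\,\left(\varsigma_{q,n}/\varsigma^\star_n\right)^2 \quad\text{and}\quad \overline{r}(n) = r(n)/n$$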

Before getting into the proof of the theorem, we present simple arguments
to derive exponential concentration inequalities for the quantities $\overline{\gamma}^N_n(1) - 1$.
Suppose that for any $\epsilon \in \{+1,-1\}$, the probability of the events

$$\frac{\epsilon}{n}\,\log\overline{\gamma}^N_n(1) \le \rho^N_n(x)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, for some function $\rho^N_n$ such that

$$\rho^N_n(x) \to_{N \to \infty} 0$$

In this case, the probability of the event

$$-\left(1 - e^{-n\rho^N_n(x)}\right) \le \overline{\gamma}^N_n(1) - 1 \le e^{n\rho^N_n(x)} - 1$$

is greater than $1 - 2e^{-x}$, for any $x \ge 0$. Choosing $N$ large enough so that
$\rho^N_n(x) \le 1/n$, we have

$$-2n\rho^N_n(x) \le -\left(1 - e^{-n\rho^N_n(x)}\right) \quad\text{and}\quad e^{n\rho^N_n(x)} - 1 \le 2n\rho^N_n(x)$$

from which we conclude that

$$\mathbb{P}\left(|\overline{\gamma}^N_n(1) - 1| \le 2n\rho^N_n(x)\right) \ge 1 - 2e^{-x}$$
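The last step rests on two elementary inequalities, valid for $u = n\rho \in [0,1]$: $e^{u} - 1 \le 2u$ and $1 - e^{-u} \le 2u$. The sketch below checks them on a grid and wraps the resulting conversion from a logarithmic bound to a relative error bound in a small helper; the function name `relative_error_bound` and the grid resolution are illustrative choices.

```python
import math

def relative_error_bound(n, rho):
    """Turn a bound rho on |log(ratio)| / n into the bound 2 n rho on |ratio - 1|.

    Only valid when n * rho <= 1, which is the 'N large enough' regime of the text.
    """
    u = n * rho
    if not 0.0 <= u <= 1.0:
        raise ValueError("the conversion requires 0 <= n * rho <= 1")
    return 2.0 * u

# Grid check of the two elementary inequalities on [0, 1].
grid = [k / 1000.0 for k in range(1001)]
upper_ok = all(math.exp(u) - 1.0 <= 2.0 * u for u in grid)
lower_ok = all(1.0 - math.exp(-u) <= 2.0 * u for u in grid)

# Worst case of the upper inequality is at u = 1, where e - 1 is about 1.718 <= 2.
worst_upper = math.exp(1.0) - 1.0
```

Outside the regime $n\rho \le 1$ the upper inequality fails (for instance $e^{2} - 1 > 4$), which is why the helper rejects such inputs.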

Now, we come to the proof of the theorem.
Proof of theorem 6.5:

We use the same line of arguments as the ones we used in section 6.4.1.
Firstly, we observe that

WuN II ^ V^ II .TV /7=7 x|| 

\\n'q,n\\ S 2^ \rq.p\^P)\\ 

q<~p<n 

^ X] S-J.P OSc(Pq,p(Gp)) < 2 X 5,,p5p l3{Pq,p) = Cqji/2 

q<p<i7i q<.p<7i 

and osc{h^^) < Cq^n- Now, we use the following decompositions 



0<q<n 0<q<n 

with the tj^j^ -measurable functions 

<„=<n/«n€Osc(i?,)nSl(i?,) 

and for any constant a* > suPq< <„ Cg^„. 

On the other hand, we have the almost sure variance estimate 

E (^f Kn] " ICi ) < -'q o.c{hiy/al' < alclJal' 
from which we conclude that 

This shows that the regularity condition stated in (95) is met by replacing the
parameters $\sigma_q$ in the variance formula (95) by the constants $\sigma_q\,c_{q,n}/a^\star_n$, with the
uniform local variance parameters $\sigma_p$ defined in (64).

The end of the proof is now a direct consequence of theorem 5.1. This ends
the proof of the theorem. ■
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Corollary 6.7 We assume that one of the regularity conditions Hm(G,M) 
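stated in section 3.4.1 is met for some $m \ge 0$, and we set

$$p_m(x) := c_1(m)\,\left(1 + 2(x + \sqrt{x})\right) + c_2(m)\,x \quad\text{and}\quad q_m(x) = c_3(m)\,\sqrt{x}$$

with the parameters

$$c_1(m) = \left(4g\,\overline{\tau}_{1,1}(m)\right)^2 + 8g\,\overline{\tau}_{3,1}(m)$$

$$c_2(m) = 4\,\chi_m\,g^{m+1}/3 \quad\text{and}\quad c_3(m) = 4g\,\sqrt{2\,\overline{\tau}_{2,2}(m)}\;\overline{\sigma}$$

In the above displayed formulae, $\overline{\tau}_{2,2}(m)$ and $\overline{\kappa}(m)$ stand for the parameters
defined in corollary 3.1, and $\sigma_n$ is the uniform local variance parameter defined
in (64).

In this situation, for any $N \ge 1$, and any $\epsilon \in \{+1,-1\}$, the probability of
each of the following events

$$\frac{\epsilon}{n}\,\log\overline{\gamma}^N_n(1) \le \frac{1}{N}\,p_m(x) + \frac{1}{\sqrt{N}}\,q_m(x)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.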
Proof: 

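Under condition $H_m(G,M)$, we have

$$r(n)/n \le \left(4g\,\overline{\tau}_{1,1}(m)\right)^2 + 8g\,\overline{\tau}_{3,1}(m)$$

and for any $p < n$

$$\varsigma^2_{p,n} \le \frac{(4g)^2}{n^2}\,\left(\sum_{p \le q < n} g_{p,q}\,\beta(P_{p,q})\right)^2 \le \frac{(4g)^2\,(n-p)}{n^2} \sum_{p \le q < n} g^2_{p,q}\,\beta(P_{p,q})^2$$

This implies that

$$\sum_{0 \le p < n} \varsigma^2_{p,n} \le \frac{(4g)^2}{n} \sum_{0 \le q < n} \tau_{2,2}(q) \le (4g)^2\,\overline{\tau}_{2,2}(m)$$

In much the same way, we prove that $\varsigma^\star_n \le 4\,\chi_m\,g^{m+1}$. The end of the proof is
now a consequence of the estimates (55). This ends the proof of the
corollary. ■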



6.6 Backward particle Markov models 

This section is concerned with the concentration properties of the backward
Markov particle measures defined in (15). Without further mention, we assume
that the Markov transitions $M_n$ satisfy the regularity condition (7), and we
consider the random fields defined below.
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Definition 6.4 We let W^'^ and W^''^ be random field models defined by 
<-^ = Vn (r,^ - r„) and W^'"" = Vn (Qir - Q„) 

The analysis of the fluctuation random fields of backward particle models
is a little more involved than that of the genealogical tree particle models.
The main difficulty is to deal with the nonlinear dependency of these backward
particle Markov chain models on the flow of particle measures $\eta^N_n$.

In section 6.6.1, we provide some preliminary key backward conditioning
principles. We also introduce some predictable integral operators involved in the
first order expansions of the fluctuation random fields discussed in section 6.6.3.
In section 6.6.2, we illustrate these models in the context of additive functional
models. In section 6.6.4, we put together the semigroup techniques developed
in earlier sections to derive a series of quantitative concentration inequalities.

6.6.1 Some preliminary conditioning principles 

By definition of the unnormalized Feynman-Kac measures $\Gamma_n$, we have

$$\Gamma_n(d(x_0,\ldots,x_n)) = \Gamma_p(d(x_0,\ldots,x_p))\;\Gamma_{n|p}(x_p, d(x_{p+1},\ldots,x_n))$$

with

$$\Gamma_{n|p}(x_p, d(x_{p+1},\ldots,x_n)) = \prod_{p < q \le n} Q_q(x_{q-1}, dx_q)$$

This implies that

$$\mathbb{Q}_n(d(x_0,\ldots,x_n)) = \mathbb{Q}_{n,p}(d(x_0,\ldots,x_p)) \times \mathbb{Q}_{n|p}(x_p, d(x_{p+1},\ldots,x_n))$$

with the $\mathbb{Q}_n$-distribution of the random states $(X_0,\ldots,X_p)$

$$\mathbb{Q}_{n,p}(d(x_0,\ldots,x_p)) := \frac{1}{\eta_p(G_{p,n})}\;\mathbb{Q}_p(d(x_0,\ldots,x_p))\;G_{p,n}(x_p)$$

and the $\mathbb{Q}_n$-conditional distribution of $(X_{p+1},\ldots,X_n)$ given the random state
$X_p = x_p$ defined by

$$\mathbb{Q}_{n|p}(x_p, d(x_{p+1},\ldots,x_n)) = \frac{1}{\Gamma_{n|p}(1)(x_p)}\;\Gamma_{n|p}(x_p, d(x_{p+1},\ldots,x_n))$$

Now, we discuss some backward conditioning principles. Using the backward
Markov chain formulation (5), we have

$$\mathbb{Q}_n(d(x_0,\ldots,x_n)) = \eta_n(dx_n)\;\mathbb{Q}_{n|n}(x_n, d(x_0,\ldots,x_{n-1}))$$

with the $\mathbb{Q}_n$-conditional distribution of $(X_0,\ldots,X_{n-1})$ given the terminal random
state $X_n = x_n$ defined by the backward Markov transition

$$\mathbb{Q}_{n|n}(x_n, d(x_0,\ldots,x_{n-1})) := \prod_{q=1}^n M_{q,\eta_{q-1}}(x_q, dx_{q-1})$$

By construction, the $\mathbb{Q}^N_n$-conditional distribution of $(X_0,\ldots,X_{n-1})$ given
the terminal random state $X_n = x_n$ is also defined by the particle backward
Markov transition given by

$$\mathbb{Q}^N_{n|n}(x_n, d(x_0,\ldots,x_{n-1})) := \prod_{q=1}^n M_{q,\eta^N_{q-1}}(x_q, dx_{q-1})$$

We check this claim, using the fact that

$$\mathbb{Q}^N_n(d(x_0,\ldots,x_n)) = \eta^N_n(dx_n)\;\mathbb{Q}^N_{n|n}(x_n, d(x_0,\ldots,x_{n-1}))$$
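On a finite state space the backward transitions can be written down explicitly, which makes the time reversal easy to check. The sketch below assumes that the backward transition associated with a measure $\eta$ takes the Bayes form $M_{\eta}(y, x) \propto \eta(x)\,G(x)\,H(x,y)$, where $H$ is the density of the forward transition w.r.t. the counting measure; this explicit form, the 3-point space and all numerical values are assumptions made for illustration. It then verifies the reversal identity $\eta(x)\,Q(x,y) = (\eta Q)(y)\,M_{\eta}(y,x)$ with $Q(x,y) = G(x)H(x,y)$, in the spirit of identity (117).

```python
def backward_kernel(eta, G, H):
    """Return B with B[y][x] = eta[x] G[x] H[x][y] / sum_z eta[z] G[z] H[z][y] (assumed form)."""
    size = len(eta)
    kernel = []
    for y in range(size):
        weights = [eta[x] * G[x] * H[x][y] for x in range(size)]
        total = sum(weights)
        kernel.append([w / total for w in weights])
    return kernel

eta = [0.5, 0.3, 0.2]             # plays the role of eta_{q-1} (or of eta^N_{q-1})
G = [1.0, 0.6, 0.8]               # hypothetical positive potential
H = [[0.7, 0.2, 0.1],             # forward transition density, rows sum to 1
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]]

B = backward_kernel(eta, G, H)

# (eta Q)(y) with Q(x, y) = G(x) H(x, y)
eta_Q = [sum(eta[x] * G[x] * H[x][y] for x in range(3)) for y in range(3)]

# Largest violation of eta(x) Q(x, y) = (eta Q)(y) B(y, x)
max_gap = max(abs(eta[x] * G[x] * H[x][y] - eta_Q[y] * B[y][x])
              for x in range(3) for y in range(3))
backward_row_sums = [sum(row) for row in B]
```

Replacing `eta` by an empirical measure of particles gives the particle backward transition; only the weights $\eta(x)G(x)H(x,y)$ need to be evaluated, and no additional sampling of the forward dynamics is required.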

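Definition 6.5 For any $0 \le p \le n$ and $N \ge 1$, we denote by $D^N_{p,n}$ and $L^N_{p,n}$
the $\mathcal{G}^N_{p-1}$-measurable integral operators defined by

$$D^N_{p,n}(x_p, d(y_0,\ldots,y_n)) := \mathbb{Q}^N_{p|p}(x_p, d(y_0,\ldots,y_{p-1}))\;\delta_{x_p}(dy_p)\;\Gamma_{n|p}(x_p, d(y_{p+1},\ldots,y_n)) \quad (111)$$

and

$$L^N_{p,n}(x_p, d(y_0,\ldots,y_n)) := \mathbb{Q}^N_{p|p}(x_p, d(y_0,\ldots,y_{p-1}))\;\delta_{x_p}(dy_p)\;\mathbb{Q}_{n|p}(x_p, d(y_{p+1},\ldots,y_n)) \quad (112)$$

For $p \in \{0, n\}$, we use the convention

$$D^N_{n,n}(x_n, d(y_0,\ldots,y_n)) = L^N_{n,n}(x_n, d(y_0,\ldots,y_n)) = \mathbb{Q}^N_{n|n}(x_n, d(y_0,\ldots,y_{n-1}))\;\delta_{x_n}(dy_n)$$

and

$$D^N_{0,n}(x_0, d(y_0,\ldots,y_n)) = \delta_{x_0}(dy_0)\;\Gamma_{n|0}(x_0, d(y_1,\ldots,y_n))$$

$$L^N_{0,n}(x_0, d(y_0,\ldots,y_n)) = \delta_{x_0}(dy_0)\;\mathbb{Q}_{n|0}(x_0, d(y_1,\ldots,y_n))$$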

The main reason for introducing these integral operators comes from the 
following integral transport properties. 

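Lemma 6.3 For any $0 \le p < n$, any $N \ge 1$, and any function $\mathbf{f}_n$ on the
path space $\mathbf{E}_n$, we have the almost sure formulae

$$\eta^N_p D^N_{p,n} = \eta^N_p(G_p) \times \Phi_{p+1}(\eta^N_p)\,D^N_{p+1,n} \quad (113)$$

and

$$\frac{\eta^N_p D^N_{p,n}(\mathbf{f}_n)}{\eta^N_p D^N_{p,n}(1)} = \frac{\Phi_{p+1}(\eta^N_p)\left(D^N_{p+1,n}(\mathbf{f}_n)\right)}{\Phi_{p+1}(\eta^N_p)\left(D^N_{p+1,n}(1)\right)} \quad (114)$$

$$= \Psi_{G_{p+1,n}}\left(\Phi_{p+1}(\eta^N_p)\right) L^N_{p+1,n}(\mathbf{f}_n) \quad (115)$$

Proof:

We check (113) using the fact that

$$\Gamma_{n|p}(x_p, d(y_{p+1},\ldots,y_n)) = Q_{p+1}(x_p, dy_{p+1})\;\Gamma_{n|p+1}(y_{p+1}, d(y_{p+2},\ldots,y_n)) \quad (116)$$

and

$$\eta^N_p(dx_p)\,Q_{p+1}(x_p, dy_{p+1}) = \eta^N_p Q_{p+1}(dy_{p+1}) \times M_{p+1,\eta^N_p}(y_{p+1}, dx_p) \quad (117)$$

More precisely, we have

$$\eta^N_p(dx_p)\,D^N_{p,n}(x_p, d(y_0,\ldots,y_n)) = \eta^N_p(dx_p)\,Q_{p+1}(x_p, dy_{p+1})\;\mathbb{Q}^N_{p|p}(x_p, d(y_0,\ldots,y_{p-1})) \times \delta_{x_p}(dy_p)\;\Gamma_{n|p+1}(y_{p+1}, d(y_{p+2},\ldots,y_n))$$

Using (117), this implies that

$$\eta^N_p(dx_p)\,D^N_{p,n}(x_p, d(y_0,\ldots,y_n)) = \eta^N_p Q_{p+1}(dy_{p+1})\;M_{p+1,\eta^N_p}(y_{p+1}, dx_p)\;\mathbb{Q}^N_{p|p}(x_p, d(y_0,\ldots,y_{p-1})) \times \delta_{x_p}(dy_p)\;\Gamma_{n|p+1}(y_{p+1}, d(y_{p+2},\ldots,y_n))$$

from which we conclude that

$$\eta^N_p D^N_{p,n}(\mathbf{f}_n) = \int \eta^N_p Q_{p+1}(dy_{p+1})\;\mathbb{Q}^N_{p+1|p+1}(y_{p+1}, d(y_0,\ldots,y_p)) \times \Gamma_{n|p+1}(y_{p+1}, d(y_{p+2},\ldots,y_n))\;\mathbf{f}_n(y_0,\ldots,y_n) = (\eta^N_p Q_{p+1})\,D^N_{p+1,n}(\mathbf{f}_n)$$

This ends the proof of the first assertion. Now, using (113) we have

$$\frac{\eta^N_p D^N_{p,n}(\mathbf{f}_n)}{\eta^N_p D^N_{p,n}(1)} = \frac{\Phi_{p+1}(\eta^N_p)\,D^N_{p+1,n}(\mathbf{f}_n)}{\Phi_{p+1}(\eta^N_p)\,D^N_{p+1,n}(1)}$$

Recalling that

$$D^N_{p,n}(1) = Q_{p,n}(1) = G_{p,n}$$

we readily prove (114) and (115). This ends the proof of the lemma. ■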



6.6.2 Additive functional models 

In this section we provide a brief discussion on the action of the operators $D^N_{p,n}$
and $L^N_{p,n}$ on additive linear functionals

$$\mathbf{f}_n(x_0,\ldots,x_n) = \sum_{p=0}^n f_p(x_p) \quad (118)$$

associated with some collection of functions $f_n \in \mathrm{Osc}(E_n)$.



^p,n\'-'^ 



^p.7l\^ } 



E K^<-. 



•^g+i.')^ 



0<q<p 
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with triangular array of Markov transitions Rp^q introduced in definition 13.11 
By definition of L^„, we also have that 

0<q<p P^Q^n 

Using the estimates (53), we prove the following upper bounds

osc(L^„(f„)) 



There are many ways to control the Dobrushin operator norm of the product
of the random matrices defined in (16). For instance, we can use the multiplicative
formulae

p<k<q 

One of the simplest ways to proceed is to assume that

$$H_n(x,y) \le \tau\,H_n(x,y') \quad (119)$$

for any $x, y, y'$, and for some finite constant $\tau < \infty$. In this situation, we find
that

from which we conclude that 



/3(Mfe+i,,«) 



< 1 - T-2 



We further assume that the condition $H_m(G,M)$ stated in section 3.4.1 is met
for some $m \ge 1$. In this situation, we have

osc(L^„(f„)) 

- l^O<q<p \^ ^ ) ^ Xmy l^p<q<n l^ » Xm ) 

from which we prove the following uniform estimates 

sup osc (i^„(f„)) <T'+m g'"'-\l (120) 

0<p<Tl 

6.6.3 A stochastic perturbation analysis 

As in section 6.3, we develop a stochastic perturbation analysis that allows us to
express $W^{\Gamma,N}_n$ and $W^{\mathbb{Q},N}_n$ in terms of the local sampling random fields $(V^N_p)_{0 \le p \le n}$.
These first order expansions will be expressed in terms of the first
order functions $d_\nu\Psi_G(f)$ introduced in (99), and the random $\mathcal{G}^N_{p-1}$-measurable
functions $G^N_{p,n}$ introduced in definition 6.1.

Definition 6.6 For any $N \ge 1$, any $0 \le p \le n$, and any function $\mathbf{f}_n$ on the
path space $\mathbf{E}_n$, we let $\mathbf{d}^N_{p,n}(\mathbf{f}_n)$ be the $\mathcal{G}^N_{p-1}$-measurable functions

$$\mathbf{d}^N_{p,n}(\mathbf{f}_n) = d_{\Phi_p(\eta^N_{p-1})}\Psi_{G_{p,n}}\left(L^N_{p,n}(\mathbf{f}_n)\right)$$
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We are now in position to state and to prove the following decomposition 
theorem. 

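Theorem 6.6 For any $0 \le p \le n$, and any function $\mathbf{f}_n$ on the path space $\mathbf{E}_n$,
we have

$$E\left(\Gamma^N_n(\mathbf{f}_n) \,\big|\, \mathcal{G}^N_p\right) = \gamma^N_p\left(D^N_{p,n}(\mathbf{f}_n)\right) \quad (121)$$

In addition, we have

$$W^{\Gamma,N}_n(\mathbf{f}_n) = \sum_{p=0}^n \gamma^N_p(1)\;V^N_p\left(D^N_{p,n}(\mathbf{f}_n)\right) \quad (122)$$

$$W^{\mathbb{Q},N}_n(\mathbf{f}_n) = \sum_{p=0}^n \frac{1}{\eta^N_p(G^N_{p,n})}\;V^N_p\left(\mathbf{d}^N_{p,n}(\mathbf{f}_n)\right) \quad (123)$$

$$= \sum_{p=0}^n V^N_p\left(\mathbf{d}^N_{p,n}(\mathbf{f}_n)\right) - \frac{1}{\sqrt{N}} \sum_{p=0}^n \frac{1}{\eta^N_p(G^N_{p,n})}\;V^N_p\left(G^N_{p,n}\right)\,V^N_p\left(\mathbf{d}^N_{p,n}(\mathbf{f}_n)\right) \quad (124)$$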
Proof: 

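To prove the first assertion, we use a backward induction on the parameter
$p$. For $p = n$, the result is immediate since we have

$$\Gamma^N_n(\mathbf{f}_n) = \gamma^N_n(1)\;\eta^N_n\left(D^N_{n,n}(\mathbf{f}_n)\right)$$

We suppose that the formula is valid at a given rank $p \le n$. In this situation,
using the fact that $D^N_{p,n}(\mathbf{f}_n)$ is a $\mathcal{G}^N_{p-1}$-measurable function, we prove that

$$E\left(\Gamma^N_n(\mathbf{f}_n) \,\big|\, \mathcal{G}^N_{p-1}\right) = E\left(\gamma^N_p\left(D^N_{p,n}(\mathbf{f}_n)\right) \,\big|\, \mathcal{G}^N_{p-1}\right) = \gamma^N_p(1)\;\Phi_p(\eta^N_{p-1})\left(D^N_{p,n}(\mathbf{f}_n)\right) \quad (125)$$

Applying (113), we also have that

$$\gamma^N_p(1)\;\Phi_p(\eta^N_{p-1})\,D^N_{p,n} = \gamma^N_{p-1}\,D^N_{p-1,n}$$

from which we conclude that the desired formula is satisfied at rank $(p-1)$.
This ends the proof of the first assertion.

Now, combining lemma 6.3 and (121), the proof of the second assertion is
simply based on the following decomposition

$$(\Gamma^N_n - \Gamma_n)(\mathbf{f}_n) = \sum_{p=0}^n \left[E\left(\Gamma^N_n(\mathbf{f}_n) \,\big|\, \mathcal{G}^N_p\right) - E\left(\Gamma^N_n(\mathbf{f}_n) \,\big|\, \mathcal{G}^N_{p-1}\right)\right]$$

$$= \sum_{p=0}^n \gamma^N_p(1)\,\left(\eta^N_p\left(D^N_{p,n}(\mathbf{f}_n)\right) - \frac{1}{\eta^N_{p-1}(G_{p-1})}\;\eta^N_{p-1}\left(D^N_{p-1,n}(\mathbf{f}_n)\right)\right)$$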
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To prove the final decomposition, we use the fact that 



}:!^^n 






0<p<r; 

with the conventions rj^^D^^ „ = ?/or„|o; for p = 0. 
Finally, we use (114) and (115) to check that



0<p<n 



We end the proof using the first order expansions of the Boltzmann-Gibbs
transformation developed in section 6.2.



This ends the proof of the theorem. 



6.6.4 Concentration inequalities 

Finite marginal models 

Given a bounded function $\mathbf{f}_n$ on the path space $\mathbf{E}_n$, we further assume that
we have some almost sure estimate

$$\sup_{N \ge 1}\,\mathrm{osc}\left(L^N_{p,n}(\mathbf{f}_n)\right) \le l_{p,n}(\mathbf{f}_n) \quad (126)$$

for some finite constant $l_{p,n}(\mathbf{f}_n)$ $(\le \|\mathbf{f}_n\|)$. For instance, for additive functionals
of the form (118), we have proved in section 6.6.2 the following uniform estimates

osc(i^„(f„))<r2 + mg2'"-ix^ 

which are valid for any $N \ge 1$ and any $0 \le p \le n$, as soon as the mixing
condition $H_m(G,M)$ stated in section 3.4.1 is met for some $m \ge 1$, and the
regularity condition (119) is satisfied for some finite $\tau$.

For any additive functional $\mathbf{f}_n$ of the form (118), we denote by $\overline{\mathbf{f}}_n = \mathbf{f}_n/(n+1)$
the normalized additive functional.

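Lemma 6.4 For any $N \ge 1$, $n \ge 0$, and any bounded function $\mathbf{f}_n$ on the path
space $\mathbf{E}_n$, we have the first order decomposition

$$W^{\mathbb{Q},N}_n(\mathbf{f}_n) = \sum_{p=0}^n V^N_p\left(\mathbf{d}^N_{p,n}(\mathbf{f}_n)\right) + \frac{1}{\sqrt{N}}\,R^N_n(\mathbf{f}_n) \quad (127)$$

with a second order remainder term $R^N_n(\mathbf{f}_n)$ such that

$$E\left(|R^N_n(\mathbf{f}_n)|^m\right)^{1/m} \le b(2m)^2\,r_n(\mathbf{f}_n)$$

for any $m \ge 1$, with some finite constant

$$r_n(\mathbf{f}_n) \le 4 \sum_{0 \le p \le n} g^2_{p,n}\,l_{p,n}(\mathbf{f}_n) \quad (128)$$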



Proof: 

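Firstly, we notice that

$$\|d^N_{p,n}(\mathbf{f}_n)\| \le \frac{\|G_{p,n}\|}{\Phi_p(\eta^N_{p-1})(G_{p,n})}\, \mathrm{osc}\left(L^N_{p,n}(\mathbf{f}_n)\right) \le g_{p,n}\, \mathrm{osc}\left(L^N_{p,n}(\mathbf{f}_n)\right)$$

Using (124), we find the decomposition (127) with the second order remainder
term

$$R^N_n(\mathbf{f}_n) = - \sum_{p=0}^n R^N_{p,n}(\mathbf{f}_n)$$

with

$$R^N_{p,n}(\mathbf{f}_n) := \frac{1}{\eta^N_p(G_{p,n})}\, V^N_p(G_{p,n})\, V^N_p\left(d^N_{p,n}(\mathbf{f}_n)\right)$$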

On the other hand, we have 

l/(2m) 



< 



g,.„ E{\v;'{G,,J\\G,,„\\)r\g^_,) 



l/(2m) 



Using (101), we prove that

The end of the proof is now clear. This ends the proof of the lemma. 



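Theorem 6.7 For any $N \ge 1$, $n \ge 0$, and any bounded function $\mathbf{f}_n$ on the path
space $\mathbf{E}_n$, the probability of the events

$$[\mathbb{Q}^N_n - \mathbb{Q}_n](\mathbf{f}_n) \le \frac{r_n(\mathbf{f}_n)}{N} \left(1 + (L_0^\star)^{-1}(x)\right) + 2\, b_n\, \bar{\sigma}_n^2\, (L_1^\star)^{-1}\left(\frac{x}{N \bar{\sigma}_n^2}\right)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$, with

$$\bar{\sigma}_n^2 := \frac{1}{b_n^2} \sum_{0 \le p \le n} g^2_{p,n}\, l_{p,n}(\mathbf{f}_n)^2\, \sigma_p^2$$

and for any choice of $b_n \ge \sup_{0 \le p \le n} g_{p,n}\, l_{p,n}(\mathbf{f}_n)$. In the above displayed
formulae, $\sigma_n$ are the uniform local variance parameters defined in (64), and $l_{p,n}(\mathbf{f}_n)$
and $r_n(\mathbf{f}_n)$ are the parameters defined respectively in (126) and (128).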
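
The two quantities entering the bound of theorem 6.7 can be evaluated numerically once the constants are known. The following is a minimal sketch, assuming the variance parameter has the form $\bar{\sigma}_n^2 = b_n^{-2} \sum_p g_{p,n}^2\, l_{p,n}(\mathbf{f}_n)^2 \sigma_p^2$ with $b_n \ge \sup_p g_{p,n}\, l_{p,n}(\mathbf{f}_n)$. The function name `variance_parameters` and the list-based inputs are illustrative choices, not part of the notes.

```python
from typing import Optional, Sequence, Tuple


def variance_parameters(
    g: Sequence[float],
    l: Sequence[float],
    sigma2: Sequence[float],
    b: Optional[float] = None,
) -> Tuple[float, float]:
    """Return (b_n, sigma_bar_n^2) for given g_{p,n}, l_{p,n}(f_n), sigma_p^2.

    Hypothetical helper: the inputs are lists indexed by p = 0, ..., n.
    If b is omitted, the smallest admissible value sup_p g_{p,n} l_{p,n} is used.
    """
    if not (len(g) == len(l) == len(sigma2)) or len(g) == 0:
        raise ValueError("g, l and sigma2 must be non-empty and of equal length")
    products = [gp * lp for gp, lp in zip(g, l)]
    b_min = max(products)
    if b is None:
        b = b_min
    if b < b_min:
        raise ValueError("b must dominate sup_p g_{p,n} * l_{p,n}")
    sigma_bar2 = sum(gl * gl * s2 for gl, s2 in zip(products, sigma2)) / (b * b)
    return b, sigma_bar2


# Illustrative numbers only (n = 1).
print(variance_parameters([1.0, 2.0], [0.5, 0.25], [1.0, 1.0]))
```

With this choice of $b_n$ every summand is at most $\sigma_p^2$, so $\bar{\sigma}_n^2$ never exceeds $\sum_p \sigma_p^2$.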
Before getting into the proof of the theorem, we present some direct
consequences of these concentration inequalities for normalized additive functionals
(we use the estimates ([55]) and ([55])).

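Corollary 6.8 We assume that the mixing condition $H_m(G,M)$ stated in
section 3.4.1 is met for some $m \ge 1$, and the regularity (119) is satisfied for some
finite $\tau$. We also suppose that the parameters $\sigma_n$ defined in (64) are uniformly
bounded, $\sigma := \sup_{n \ge 0} \sigma_n < \infty$, and we set

$$c_1(m) := 2\, g^m \chi_m \left(\tau^2 + m\, g^{2m-1} \chi_m^3\right) \quad \text{and} \quad c_2(m) := 2\, (g^m \chi_m)\, c_1(m)$$

In this notation, for any $N \ge 1$, $n \ge 0$, and any normalized additive functional
$\bar{\mathbf{f}}_n$ on the path space $\mathbf{E}_n$, the probability of the events

$$[\mathbb{Q}^N_n - \mathbb{Q}_n](\bar{\mathbf{f}}_n) \le \frac{c_2(m)}{N} \left(1 + (L_0^\star)^{-1}(x)\right) + c_1(m)\, \sigma^2\, (L_1^\star)^{-1}\left(\frac{x}{N (n+1) \sigma^2}\right)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.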

Corollary 6.9 We assume that the assumptions of corollary 6.8 are satisfied,
and we set

$$p_{m,n}(x) = c_2(m) \left(1 + 2(x + \sqrt{x})\right) + \frac{c_1(m)}{3(n+1)}\, x$$

and

$$q_{m,n}(x) = c_1(m) \sqrt{\frac{2 x \sigma^2}{n+1}}$$

with the constants $c_1(m)$ and $c_2(m)$ defined in corollary 6.8.
In this situation, the probability of the events

$$[\mathbb{Q}^N_n - \mathbb{Q}_n](\bar{\mathbf{f}}_n) \le \frac{1}{N}\, p_{m,n}(x) + \frac{1}{\sqrt{N}}\, q_{m,n}(x)$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.
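
The passage from corollary 6.8 to corollary 6.9 can be sketched as follows. The sketch assumes that the event of corollary 6.8 has the form shown in the first line below, and that the inverse functions satisfy $(L_0^\star)^{-1}(x) = 2(x + \sqrt{x})$ and $(L_1^\star)^{-1}(x) \le x/3 + \sqrt{2x}$. Both properties are inferred here from the shape of $p_{m,n}$ and $q_{m,n}$, so they should be checked against the estimates referred to before corollary 6.8.

```latex
\begin{aligned}
[\mathbb{Q}^N_n - \mathbb{Q}_n](\bar{\mathbf{f}}_n)
&\le \frac{c_2(m)}{N}\left(1 + (L_0^\star)^{-1}(x)\right)
   + c_1(m)\,\sigma^2\,(L_1^\star)^{-1}\!\left(\frac{x}{N(n+1)\sigma^2}\right) \\
&\le \frac{c_2(m)}{N}\left(1 + 2(x+\sqrt{x})\right)
   + c_1(m)\,\sigma^2\left[\frac{x}{3N(n+1)\sigma^2}
   + \sqrt{\frac{2x}{N(n+1)\sigma^2}}\,\right] \\
&= \frac{1}{N}\left[c_2(m)\left(1 + 2(x+\sqrt{x})\right) + \frac{c_1(m)}{3(n+1)}\,x\right]
   + \frac{1}{\sqrt{N}}\; c_1(m)\sqrt{\frac{2x\sigma^2}{n+1}} \\
&= \frac{1}{N}\,p_{m,n}(x) + \frac{1}{\sqrt{N}}\,q_{m,n}(x)
\end{aligned}
```

The term of order $1/\sqrt{N}$ decays like $1/\sqrt{n+1}$ in the time horizon, which reflects the averaging built into the normalized functional $\bar{\mathbf{f}}_n$.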

Proof of theorem 6.7:

We use the same line of arguments as the ones we used in section 6.4.1.
Firstly, we notice that

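$$\|d^N_{p,n}(\mathbf{f}_n)\| \le g_{p,n}\, l_{p,n}(\mathbf{f}_n)$$

This yields the decompositions

$$\sum_{p=0}^n V^N_p\left(d^N_{p,n}(\mathbf{f}_n)\right) = \sum_{p=0}^n a_p\, V^N_p\left(\delta^N_{p,n}(\mathbf{f}_n)\right)$$

with the functions

$$\delta^N_{p,n}(\mathbf{f}_n) = d^N_{p,n}(\mathbf{f}_n) / a_p$$

and for any finite constants

$$a_p \ge 2 \sup_{0 \le p \le n} g_{p,n}\, l_{p,n}(\mathbf{f}_n)$$

On the other hand, we also have that

$$\mathbb{E}\left( V^N_p\left(\delta^N_{p,n}(\mathbf{f}_n)\right)^2 \mid \mathcal{G}^N_{p-1} \right) \le \frac{1}{a_p^2}\, 4\, \sigma_p^2\, g^2_{p,n}\, l_{p,n}(\mathbf{f}_n)^2$$

with $a^\star := \sup_{0 \le p \le n} a_p$, and the uniform local variance parameters $\sigma_p^2$ defined
in (64).

This shows that the regularity condition stated in (95) is met by replacing the
parameters $\sigma_p$ in the variance formula (95) by the constants $2 \sigma_p\, g_{p,n}\, l_{p,n}(\mathbf{f}_n) / a_p$,
with the uniform local variance parameters $\sigma_p$ defined in (64).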

Using theorem 5.1, we easily prove the desired concentration property. This
ends the proof of the theorem. ■

Empirical processes 

Using the same line of arguments as the ones we used in section 6.4.2, we
prove the following concentration inequality.

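Theorem 6.8 We let $\mathcal{F}_n$ be a separable collection of measurable functions $\mathbf{f}_n$
on $\mathbf{E}_n$, such that $\|\mathbf{f}_n\| \le 1$, $\mathrm{osc}(\mathbf{f}_n) \le 1$, with finite entropy $I(\mathcal{F}_n) < \infty$. We have

$$\pi_\psi\left( \left\| W^{\mathbb{Q},N}_n \right\|_{\mathcal{F}_n} \right) \le c_{\mathcal{F}_n} \sum_{p=0}^n g_{p,n}\, \|l_{p,n}\|_{\mathcal{F}_n}$$

with the functional $\mathbf{f}_n \in \mathcal{F}_n \mapsto l_{p,n}(\mathbf{f}_n)$ defined in (126) and

$$c_{\mathcal{F}_n} \le 24^2 \int_0^1 \sqrt{\log\left(8 + \mathcal{N}(\mathcal{F}_n, \epsilon)^2\right)}\, d\epsilon$$

In particular, for any $n \ge 0$, and any $N \ge 1$, the probability of the following
event

$$\sup_{\mathbf{f}_n \in \mathcal{F}_n} \left| \mathbb{Q}^N_n(\mathbf{f}_n) - \mathbb{Q}_n(\mathbf{f}_n) \right| \le \frac{c_{\mathcal{F}_n}}{\sqrt{N}} \sum_{p=0}^n g_{p,n}\, \|l_{p,n}\|_{\mathcal{F}_n}\, \sqrt{x + \log 2}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.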

Proof: 

Using (123), for any function $\mathbf{f}_n \in \mathrm{Osc}(\mathbf{E}_n)$ we have the estimate

n 
p=0 

with the $\mathcal{G}^N_{p-1}$-measurable random functions $\delta^N_{p,n}(\mathbf{f}_n)$ on $E_p$ defined by

'^f'P,n V^nj II ^p.n \\ 
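By construction, we have

$$\|\delta^N_{p,n}(\mathbf{f}_n)\| \le 1/2 \quad \text{and} \quad \delta^N_{p,n}(\mathbf{f}_n) \in \mathrm{Osc}(E_p)$$

Using the uniform estimate (100), if we set

$$\mathcal{F}^N_{p,n} := \delta^N_{p,n}(\mathcal{F}_n) = \left\{ \delta^N_{p,n}(\mathbf{f}_n) : \mathbf{f}_n \in \mathcal{F}_n \right\}$$

then we also prove the almost sure upper bound

$$\sup_{N \ge 1} \mathcal{N}\left[\mathcal{F}^N_{p,n}, \epsilon\right] \le \mathcal{N}(\mathcal{F}_n, \epsilon/2)$$

The end of the proof is now a direct consequence of theorem 5.2 and
theorem 221. This ends the proof of the theorem. ■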
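
The entropy constant in theorem 6.8 can be approximated numerically once a bound on the covering numbers is available. The following is a minimal sketch, assuming the constant has the form $24^2 \int_0^1 \sqrt{\log(8 + \mathcal{N}(\mathcal{F}_n, \epsilon)^2)}\, d\epsilon$ and a purely hypothetical polynomial covering-number model $\mathcal{N}(\epsilon) = \max(1, (c/\epsilon)^v)$. The helper names `entropy_integral` and `polynomial_covering` are illustrative.

```python
import math
from typing import Callable


def entropy_integral(covering_number: Callable[[float], float], steps: int = 10_000) -> float:
    """Midpoint-rule approximation of int_0^1 sqrt(log(8 + N(eps)^2)) d eps."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        eps = (k + 0.5) * h  # midpoints avoid evaluating N at eps = 0
        total += math.sqrt(math.log(8.0 + covering_number(eps) ** 2))
    return total * h


def polynomial_covering(v: float, c: float = 1.0) -> Callable[[float], float]:
    """Hypothetical covering-number model N(eps) = max(1, (c / eps)^v)."""
    return lambda eps: max(1.0, (c / eps) ** v)


# Illustrative evaluation of the entropy constant for v = 2.
constant = 24 ** 2 * entropy_integral(polynomial_covering(2.0))
print(constant)
```

The integrand only grows like $\sqrt{\log(1/\epsilon)}$ near $\epsilon = 0$, so the integral is finite for any polynomial covering bound and a plain midpoint rule is adequate for a rough estimate.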



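Definition 6.7 We let $\mathcal{F} = (\mathcal{F}_n)_{n \ge 0}$ be a sequence of separable collections $\mathcal{F}_n$
of measurable functions $f_n$ on $E_n$, such that $\|f_n\| \le 1$, $\mathrm{osc}(f_n) \le 1$, and finite
entropy $I(\mathcal{F}_n) < \infty$.

For any $n \ge 0$, we set

$$J_n(\mathcal{F}) := 24^2 \sup_{0 \le q \le n} \int_0^1 \sqrt{\log\left(8 + \mathcal{N}(\mathcal{F}_q, \epsilon)^2\right)}\, d\epsilon$$

We also denote by $\Sigma_n(\mathcal{F})$ the collection of additive functionals defined by

$$\Sigma_n(\mathcal{F}) = \left\{ \mathbf{f}_n \in \mathcal{B}(\mathbf{E}_n) \ \text{such that} \ \forall\, \mathbf{x}_n = (x_0, \ldots, x_n) \in \mathbf{E}_n : \ \mathbf{f}_n(\mathbf{x}_n) = \sum_{p=0}^n f_p(x_p) \ \text{with} \ f_p \in \mathcal{F}_p, \ \text{for} \ 0 \le p \le n \right\}$$

and

$$\bar{\Sigma}_n(\mathcal{F}) = \left\{ \mathbf{f}_n / (n+1) \ : \ \mathbf{f}_n \in \Sigma_n(\mathcal{F}) \right\}$$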
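
The two classes of definition 6.7 can be illustrated with a short sketch. It builds an additive functional $\mathbf{f}_n(x_0, \ldots, x_n) = \sum_p f_p(x_p)$ from marginal functions and then its normalized version $\mathbf{f}_n/(n+1)$. The helper names and the choice of indicator functions are illustrative assumptions, not taken from the notes.

```python
from typing import Callable, Sequence

Marginal = Callable[[float], float]
PathFunctional = Callable[[Sequence[float]], float]


def additive_functional(fs: Sequence[Marginal]) -> PathFunctional:
    """Return f_n(x_0, ..., x_n) = sum_p f_p(x_p) for the marginal functions fs."""
    def f_n(path: Sequence[float]) -> float:
        if len(path) != len(fs):
            raise ValueError("path length must equal the number of marginal functions")
        return sum(f(x) for f, x in zip(fs, path))
    return f_n


def normalized(f_n: PathFunctional, n: int) -> PathFunctional:
    """Return the normalized functional f_n / (n + 1)."""
    return lambda path: f_n(path) / (n + 1)


def indicator(x: float) -> float:
    """Illustrative marginal function: indicator of the half-line (-inf, 0.5]."""
    return 1.0 if x <= 0.5 else 0.0


n = 3
f_n = additive_functional([indicator] * (n + 1))
f_bar = normalized(f_n, n)
path = [0.1, 0.7, 0.4, 0.9]
print(f_n(path), f_bar(path))
```

When every marginal function takes values in $[0, 1]$, the additive functional ranges over $[0, n+1]$ while its normalized version stays in $[0, 1]$, which is why uniform-in-time estimates are stated for the normalized class.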

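Theorem 6.9 For any $N \ge 1$, and any $n \ge 0$, we have

$$\pi_\psi\left( \left\| W^{\mathbb{Q},N}_n \right\|_{\bar{\Sigma}_n(\mathcal{F})} \right) \le a_n\, J_n(\mathcal{F}) \sum_{p=0}^n g_{p,n}$$

with some constant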

and for any collection of $[0,1]$-valued parameters $\beta_{p,q}$ such that

$$\forall\, 0 \le q < p \qquad \sup \beta\left( M_{p,\eta^N_{p-1}} \cdots M_{q+1,\eta^N_q} \right) \le \beta_{p,q}$$

Proof: 

By definition of the operator $d^N_{p,n}$ given in definition 6.6, for additive functionals
$\mathbf{f}_n$ of the form (118), we find that

C„(fn) := $,(7yf_0(Gp,„) X d^^„(f„) 

= E <q'^(A)+ E <.^^(/.) 

0<(j<p p<.q<.n 
with






and

$$M^{(N)}_{p,q} := M_{p,\eta^N_{p-1}} \cdots M_{q+1,\eta^N_q}$$



This implies that

|^p^(d;,^n(fn))| 



^P,n| 



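with

$$\delta^{(1)}_{p,q,n}(f_q) := \frac{1}{2 \beta_{p,q}\, \|G_{p,n}\|}\; G_{p,n} \left[ Id - \Psi_{G_{p,n}}\left(\Phi_p(\eta^N_{p-1})\right) \right] M^{(N)}_{p,q}(f_q)$$

and

$$\delta^{(2)}_{p,q,n}(f_q) := \frac{1}{2 \beta\left(R^{(n)}_{p,q}\right) \|G_{p,n}\|}\; G_{p,n} \left[ Id - \Psi_{G_{p,n}}\left(\Phi_p(\eta^N_{p-1})\right) \right] R^{(n)}_{p,q}(f_q) \qquad (129)$$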



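By construction, we have that

$$\left\| \delta^{(i)}_{p,q,n}(f_q) \right\| \le 1/2 \quad \text{and} \quad \mathrm{osc}\left( \delta^{(i)}_{p,q,n}(f_q) \right) \le 1$$

for any $i \in \{1, 2\}$. We set

$$\mathcal{F}^{(i)}_{p,q,n} := \delta^{(i)}_{p,q,n}(\mathcal{F}_q) = \left\{ \delta^{(i)}_{p,q,n}(f_q) : f_q \in \mathcal{F}_q \right\}$$

Using the uniform estimate (100), we also prove the almost sure upper bound

$$\sup \mathcal{N}\left[ \mathcal{F}^{(i)}_{p,q,n}, \epsilon \right] \le \mathcal{N}(\mathcal{F}_q, \epsilon/2)$$

Using theorem 5.2, we prove that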

ir^ (supf„^s„(^) l^p"^ «„(fn))|) < a« l|Gp,„|| J„(^) 

The end of the proof of the theorem is now a direct consequence of the
decomposition (123). This ends the proof of the theorem. ■
Corollary 6.10 We further assume that the condition $H_m(G,M)$ stated in
section 3.4.1 is met for some $m \ge 1$, the condition (119) is satisfied for some
$\tau$, and we have $J(\mathcal{F}) := \sup_{n \ge 0} J_n(\mathcal{F}) < \infty$. In this situation, we have the
uniform estimates

with some constant 

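In particular, for any time horizon $n \ge 0$, and any $N \ge 1$, the probability of the
following event

$$\left\| \mathbb{Q}^N_n - \mathbb{Q}_n \right\|_{\bar{\Sigma}_n(\mathcal{F})} \le \frac{1}{\sqrt{N}}\; c_{\mathcal{F}}(m)\, \sqrt{x + \log 2}$$

is greater than $1 - e^{-x}$, for any $x \ge 0$.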
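
A bound of this shape can be inverted to choose the number of particles. The following is a minimal sketch, assuming the event reads $\|\mathbb{Q}^N_n - \mathbb{Q}_n\| \le c\, \sqrt{(x + \log 2)/N}$ with probability at least $1 - e^{-x}$. Setting $e^{-x} = \delta$ gives $x + \log 2 = \log(2/\delta)$, so a tolerance `tol` is guaranteed as soon as $N \ge c^2 \log(2/\delta) / \mathrm{tol}^2$. The constant `c` stands for $c_{\mathcal{F}}(m)$ and has to be supplied by the user; the function names are illustrative.

```python
import math


def deviation_bound(c: float, N: int, x: float) -> float:
    """Right-hand side c * sqrt((x + log 2) / N) of the concentration event."""
    if N < 1 or x < 0.0:
        raise ValueError("N must be >= 1 and x must be >= 0")
    return c / math.sqrt(N) * math.sqrt(x + math.log(2.0))


def particles_needed(c: float, tol: float, delta: float) -> int:
    """Smallest N with c * sqrt(log(2 / delta) / N) <= tol."""
    if not (0.0 < delta < 1.0) or tol <= 0.0 or c <= 0.0:
        raise ValueError("need c > 0, tol > 0 and 0 < delta < 1")
    return math.ceil(c * c * math.log(2.0 / delta) / (tol * tol))


# Illustrative values: unit constant, tolerance 0.1, confidence level 95%.
N = particles_needed(c=1.0, tol=0.1, delta=0.05)
print(N, deviation_bound(1.0, N, math.log(1.0 / 0.05)))
```

The required number of particles grows only logarithmically in $1/\delta$ but quadratically in $1/\mathrm{tol}$, and it does not depend on the time horizon $n$ because the estimate is uniform in time.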

Proof: 

When the conditions $H_m(G,M)$ and (119) are satisfied, we proved in
section 6.6.2 that

$$\sum_{0 \le q < p} \beta_{p,q} + \sum_{p \le q \le n} \beta\left(R^{(n)}_{p,q}\right) \le \tau^2 + m\, g^{2m-1} \chi_m^3$$

We end the proof of the corollary recalling that $g_{p,n} \le \chi_m\, g^m$. This ends the
proof of the corollary. ■

We end this section with a direct consequence of ([75]) . 

Corollary 6.11 We further assume that the condition $H_m(G,M)$ stated in
section 3.4.1 is met for some $m \ge 1$, the condition (119) is satisfied for some $\tau$.
We let $\mathcal{F}_n$ be the set of product functions of indicator of cells in the path space
$\mathbf{E}_n = \mathbb{R}^d$, for some $d \ge 1$, $p \ge 0$.

In this situation, for any time horizon $n \ge 0$, and any $N \ge 1$, the probability
of the following event



\\Qn-Qn\y^^^^< C{m) \J^{X + ^) 

is greater than $1 - e^{-x}$, for any $x \ge 0$, with some constant

$$c(m) \le c\, \chi_m\, g^m \left(\tau^2 + m\, g^{2m-1} \chi_m^3\right)$$

In the above display, $c$ stands for some finite universal constant.